Bug#872277: patched version needed to avoid fatal: Out of memory, getdelim failed crash
Hi Jonathan -- thank you very much for looking into it > > Subject: patched version needed to avoid fatal: Out of memory, getdelim > > failed crash > You could say that about all bugs. A patched version is needed to fix > the bug. :) ;) indeed... should have said "attached patch is needed to ..." ;-) > [...] > > For a while this issue was haunting us in various deployments, in > > particular > > - on NFS mounts > > - within standalone build of git-annex > > that many git commands could crash randomly at various points with > > fatal: Out of memory, getdelim failed > > message. Recently upstream tracked it down, and offered a perspective patch > > (see attached, cherry-picked from git upstream repository) -- it is a very > > simple patch. I have tested it on our deployments (as applied against > > 1:2.11.0-3 in stretch, but it is applicable to sid/experimental version as > > well) > > original report upstream and the thread with further discussion(s) on > > possible improvements to the patch: > > https://public-inbox.org/git/20170809173928.h2ylvg5tp2p5i...@hopa.kiewit.dartmouth.edu/ > > Please please consider adapting this patch for the official packages of git > > in > > Debian sid, and ideally to the version in stretch. This would be of great > > help > > to avoid those problems on the NFS-mounts and to generate fixed standalone > > builds of git-annex (CCing git-annex author, Joey) which many users rely > > upon > > on various (even non-Debian) systems. > Thanks for writing. Can you say more about the impact? devastation... pain ... suffering... really -- somehow it started to show up more for us recently and since no easy workaround -- I just had to provide a patched version to get around. In datalad we use git extensively so we run into higher chance to hit it because git is likely to be invoked multiple times per 1 datalad command. Just now, my previous email was to a student reporting the issue while trying to tame git. Also running it within singularity container also seems to make this bug more reproducible -- and we are switching to use Debian singularity container for many computations on our CentOS HPC... so although very basic, its impact is large and it greatly reduces our quality of life ;) > E.g. is there a > reproduction recipe I can use to experience this on stable? How often > do people experience this, and do they have a workaround? (un)fortunately so far I just had experienced it within stable debian singularity environment (such as our neurodebian one, e.g. this one (http://datasets.datalad.org/singularity/neurodebian-v2.2.img.tgz, it is huge etc), must be on a bind-mount of NFS-mounted partition (not sure if specific mount options are in play here as well). but because issue is kinda ill-messaged (has actually nothing to do with insufficient memory as far as I see it), I think it also misguides some folks: quick google lead to https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=842586 which was closed since on the next run build succeeded... I do not think that there is a workaround besides a patch or rebuild (with HAVE_GETDELIM=) > The patch also appeared to still be under discussion upstream. Someone > investigating how the standard and various platforms handle errors from > getdelim would likely be appreciated there. yeah > One possibility in the Debian packaging would be for us to pass > HAVE_GETDELIM= in our build flags to avoid using getdelim altogether. > That way, we get correct behavior while waiting for the upstream > discussion to settle. whatever you think would be more appropriate is good with me, as long as we do not keep hitting this issue ;-) I just know that initial patch seems have resolved issue for me, and I haven't tried building without that functionality (i.e. with HAVE_GETDELIM=) so not sure if that one wouldn't have any side-effects if upstream largely builds/tests with it -- Yaroslav O. Halchenko Center for Open Neuroscience http://centerforopenneuroscience.org Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik
Bug#872277: patched version needed to avoid fatal: Out of memory, getdelim failed crash
retitle 872277 git: spurious "fatal: Out of memory, getdelim failed" forwarded 872277 https://public-inbox.org/git/20170809173928.h2ylvg5tp2p5i...@hopa.kiewit.dartmouth.edu/#r # bug introduced in v2.5.0-rc0~24^2~4 (strbuf_getwholeline: use getdelim # if it is available, 2015-04-16) found 872277 git/1:2.5.0-1 quit Hi, Yaroslav Halchenko wrote: > Subject: patched version needed to avoid fatal: Out of memory, getdelim > failed crash You could say that about all bugs. A patched version is needed to fix the bug. :) [...] > For a while this issue was haunting us in various deployments, in particular > - on NFS mounts > - within standalone build of git-annex > that many git commands could crash randomly at various points with > > fatal: Out of memory, getdelim failed > > message. Recently upstream tracked it down, and offered a perspective patch > (see attached, cherry-picked from git upstream repository) -- it is a very > simple patch. I have tested it on our deployments (as applied against > 1:2.11.0-3 in stretch, but it is applicable to sid/experimental version as > well) > > original report upstream and the thread with further discussion(s) on > possible improvements to the patch: > https://public-inbox.org/git/20170809173928.h2ylvg5tp2p5i...@hopa.kiewit.dartmouth.edu/ > > Please please consider adapting this patch for the official packages of git in > Debian sid, and ideally to the version in stretch. This would be of great > help > to avoid those problems on the NFS-mounts and to generate fixed standalone > builds of git-annex (CCing git-annex author, Joey) which many users rely upon > on various (even non-Debian) systems. Thanks for writing. Can you say more about the impact? E.g. is there a reproduction recipe I can use to experience this on stable? How often do people experience this, and do they have a workaround? The patch also appeared to still be under discussion upstream. Someone investigating how the standard and various platforms handle errors from getdelim would likely be appreciated there. One possibility in the Debian packaging would be for us to pass HAVE_GETDELIM= in our build flags to avoid using getdelim altogether. That way, we get correct behavior while waiting for the upstream discussion to settle. Thanks, Jonathan
Bug#872277: patched version needed to avoid fatal: Out of memory, getdelim failed crash
Package: git Version: 1:2.14.0-1 Severity: important Tags: upstream patch Dear git maintainers, For a while this issue was haunting us in various deployments, in particular - on NFS mounts - within standalone build of git-annex that many git commands could crash randomly at various points with fatal: Out of memory, getdelim failed message. Recently upstream tracked it down, and offered a perspective patch (see attached, cherry-picked from git upstream repository) -- it is a very simple patch. I have tested it on our deployments (as applied against 1:2.11.0-3 in stretch, but it is applicable to sid/experimental version as well) original report upstream and the thread with further discussion(s) on possible improvements to the patch: https://public-inbox.org/git/20170809173928.h2ylvg5tp2p5i...@hopa.kiewit.dartmouth.edu/ Please please consider adapting this patch for the official packages of git in Debian sid, and ideally to the version in stretch. This would be of great help to avoid those problems on the NFS-mounts and to generate fixed standalone builds of git-annex (CCing git-annex author, Joey) which many users rely upon on various (even non-Debian) systems. Thank you very much in advance -- System Information: Debian Release: 9.0 APT prefers unstable APT policy: (600, 'unstable'), (300, 'experimental'), (100, 'unstable-debug'), (100, 'stable') Architecture: amd64 (x86_64) Foreign Architectures: i386 Kernel: Linux 4.9.0-2-amd64 (SMP w/4 CPU cores) Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) Versions of packages git depends on: ii dpkg 1.18.24 ii git-man 1:2.14.0-1 ii libc62.24-10 ii libcurl3-gnutls 7.51.0-1 ii liberror-perl0.17024-1 ii libexpat12.2.0-2 ii libpcre2-8-0 10.22-3 ii perl 5.26.0-5 ii zlib1g 1:1.2.8.dfsg-5 Versions of packages git recommends: ii less 481-2.1 ii openssh-client [ssh-client] 1:7.4p1-10 ii patch2.7.5-1+b2 Versions of packages git suggests: ii gettext-base 0.19.8.1-2 pn git-cvs pn git-daemon-run | git-daemon-sysvinit pn git-doc ii git-el1:2.11.0-3 ii git-email 1:2.14.0-1 ii git-gui 1:2.14.0-1 pn git-mediawiki ii git-svn 1:2.14.0-1 ii gitk 1:2.14.0-1 pn gitweb -- no debconf information >From 642956cf455ff8635be32b3160b12369da73cfe2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ren=C3=A9=20Scharfe?=Date: Thu, 10 Aug 2017 22:56:40 +0200 Subject: [PATCH] strbuf: clear errno before calling getdelim(3) getdelim(3) returns -1 at the end of the file and if it encounters an error, but sets errno only in the latter case. Set errno to zero before calling it to avoid misdiagnosing an out-of-memory condition due to a left-over value from some other function call. Reported-by: Yaroslav Halchenko Suggested-by: Junio C Hamano Signed-off-by: Rene Scharfe Signed-off-by: Junio C Hamano --- strbuf.c | 1 + 1 file changed, 1 insertion(+) diff --git a/strbuf.c b/strbuf.c index ace58e7367..977acba31b 100644 --- a/strbuf.c +++ b/strbuf.c @@ -472,6 +472,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) /* Translate slopbuf to NULL, as we cannot call realloc on it */ if (!sb->alloc) sb->buf = NULL; + errno = 0; r = getdelim(>buf, >alloc, term, fp); if (r > 0) { -- 2.14.0