Bug#800523: nscd netgroup cache occasionally not updated, nscd -i netgroup hangs

2015-12-17 Thread Aurelien Jarno
On 2015-09-30 12:29, Mike Gabriel wrote:
> Package: nscd
> Severity: important
> Version: 2.19-18+deb8u1
> Tags: patch
> Usertags: debian-edu
> User: debian-...@lists.debian.org
> X-Debbugs-Cc: debian-...@lists.debian.org
> 
> Dear maintainers,
> 
> the Debian Edu main server in jessie heavily relies on working netgroups
> code in glibc / nscd for allowing NFS access from hosts on the network.
> 
> The setup is:
> 
> /etc/nsswitch.conf:
> 
> """
> netgroup: files ldap
> """
> 
> The LDAP nss provider is libnss-ldapd via nslcd. NIS netgroups are only
> configured in LDAP, no local /etc/netgroup file is present. NIS netgroup
> caching in nscd.conf is enabled.
> 
> In some cases (unclear what triggers it) the following is observed:
> 
>   o add a new host to the NIS netgroup "workstation-hosts"
>   o wait for a while (i.e., we even tried days...)
>   o "getent netgroup workstation-hosts" does not list the new host as
> netgroup member
>   o trying
> 
> $ innetgr -h  workstation-hosts || echo FALSE
> 
> echoes "FALSE" on the terminal.
>   o sometimes there is even a difference between what getent netgroup
> returns and what innetgr returns (a host triple is listed by
> getent netgroup, but is not found when querying for that host via innetgr).
>   o Attempting cache clean-up (nscd -i netgroup) fails, the command hangs and
> does not return to a command prompt
> 
> The behaviour occurs very often on Debian Edu jessie main server
> installations (and also on a vanilla Debian jessie server using a similar
> NIS netgroup / NFS setup). It does not always occur. Note that my host
> netgroups are always full of host triples (long strings, spanning
> several lines on a normal 80x25 terminal).
> 
> From looking at debdiffs between glibc in unstable and jessie
> (2.19-18+deb8u1), the issue is probably also present in Debian unstable, but
> may have been fixed in glibc 2.21 (currently in experimental).

Given that I don't have a test setup to reproduce the issue, and now that
2.21 is in testing, it would be nice if you could try this version to see
if it improves things. That will at least tell us whether we need to look
for patches to backport or write new patches to fix the issues.

> The debdiff between glibc in wheezy (2.13-38+deb7u8) and jessie
> (2.19-18+deb8u1) suggests that changes to the netgroup caching code
> (there have been quite a few nscd caching changes between those two
> versions) may have introduced this issue between glibc 2.13 and 2.19.
> 
> The above issue is definitely not present in glibc from Debian squeeze (we
> have many servers running that version) and is probably not present in
> Debian wheezy either (only one test server deployed), but it really bites
> us (the Debian Edu team) on Debian Edu jessie.
> 
> The workaround at the moment is to disable nscd netgroup caching in
> nscd.conf. This is far from optimal.
> 
> Upstream has recently observed issues with (LDAP and) netgroup caching as well:
> 
>   https://sourceware.org/bugzilla/show_bug.cgi?id=16878
>   Patch: 
> https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commitdiff;h=c3ec475c5dd16499aa040908e11d382c3ded9692;hp=aa2f176d6f75b86b91e544c2e494066ac8f88cbd

This has already been backported to jessie.

>   https://sourceware.org/bugzilla/show_bug.cgi?id=16760
> https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=dd3022d75e6fb8957843d6d84257a5d8457822d5

This one is actually from BZ 16759

> https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ea7d8b95e2fcb81f68b04ed7787a3dbda023991a

It does indeed look like a good idea to backport them.

>   https://sourceware.org/bugzilla/show_bug.cgi?id=16695
> https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c44496df2f090a56d3bf75df930592dac6bba46f

This has already been backported to jessie.
 
>   https://sourceware.org/bugzilla/show_bug.cgi?id=16758
> https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=fbd6b5a4052316f7eb03c4617eebfaafc59dcc06
> 
> BZ #16878 in particular looks like a good candidate for fixing this. I
> recommend backporting all of the above patches since, judging by those
> bug reports, glibc 2.19 seems quite buggy regarding netgroup caching in
> nscd.

I will try to backport them for the next point release.

Aurelien

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net




Bug#800523: nscd netgroup cache occasionally not updated, nscd -i netgroup hangs

2015-09-30 Thread Mike Gabriel

Package: nscd
Severity: important
Version: 2.19-18+deb8u1
Tags: patch
Usertags: debian-edu
User: debian-...@lists.debian.org
X-Debbugs-Cc: debian-...@lists.debian.org

Dear maintainers,

the Debian Edu main server in jessie heavily relies on working  
netgroups code in glibc / nscd for allowing NFS access from hosts on  
the network.


The setup is:

/etc/nsswitch.conf:

"""
netgroup: files ldap
"""

The LDAP nss provider is libnss-ldapd via nslcd. NIS netgroups are  
only configured in LDAP, no local /etc/netgroup file is present. NIS  
netgroup caching in nscd.conf is enabled.


In some cases (unclear what triggers it) the following is observed:

  o add a new host to the NIS netgroup "workstation-hosts"
  o wait for a while (i.e., we even tried days...)
  o "getent netgroup workstation-hosts" does not list the new host as  
netgroup member

  o trying

$ innetgr -h  workstation-hosts || echo FALSE

echoes "FALSE" on the terminal.
  o sometimes there is even a difference between what getent netgroup
returns and what innetgr returns (a host triple is listed by
getent netgroup, but is not found when querying for that host via innetgr).
  o Attempting cache clean-up (nscd -i netgroup) fails, the command hangs and
does not return to a command prompt
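
The checks above could be scripted roughly as follows. This is only a
minimal sketch and not part of the original setup: the function names are
made up for illustration, it assumes that "getent netgroup" prints members
as parenthesised host,user,domain triples, and it relies only on the
innetgr exit status and the nscd -i hang described above. The timeout
values are arbitrary, and nscd -i needs to be run as root.

```python
import subprocess


def parse_netgroup_triples(getent_output):
    """Parse `getent netgroup NAME` output into (host, user, domain) tuples.

    Assumption: members are printed as "(host,user,domain)" entries
    following the netgroup name.
    """
    triples = []
    for chunk in getent_output.split("(")[1:]:
        body = chunk.split(")", 1)[0]
        fields = [field.strip() for field in body.split(",")]
        if len(fields) == 3:
            triples.append(tuple(fields))
    return triples


def host_listed(getent_output, host):
    """True if `host` appears as the host field of any parsed triple."""
    return any(t[0] == host for t in parse_netgroup_triples(getent_output))


def innetgr_says_member(netgroup, host, timeout=10):
    """Run innetgr; a zero exit status is taken to mean 'is a member'."""
    result = subprocess.run(["innetgr", "-h", host, netgroup], timeout=timeout)
    return result.returncode == 0


def invalidate_hangs(timeout=30):
    """Return True if `nscd -i netgroup` does not finish within `timeout` s."""
    try:
        subprocess.run(["nscd", "-i", "netgroup"], timeout=timeout)
    except subprocess.TimeoutExpired:
        return True
    return False
```

Feeding the output of "getent netgroup workstation-hosts" to host_listed()
and comparing the result with innetgr_says_member() for the same host would
flag the inconsistency between the two lookups; invalidate_hangs() turns
the hanging cache clean-up into a boolean instead of a blocked terminal.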

The behaviour occurs very often on Debian Edu jessie main server  
installations (and also on a vanilla Debian jessie server using a  
similar NIS netgroup / NFS setup). It does not always occur. Note  
that my host netgroups are always full of host triples (long strings,  
spanning several lines on a normal 80x25 terminal).


From looking at debdiffs between glibc in unstable and jessie  
(2.19-18+deb8u1), the issue is probably also present in Debian  
unstable, but may have been fixed in glibc 2.21 (currently in  
experimental).


The debdiff between glibc in wheezy (2.13-38+deb7u8) and jessie  
(2.19-18+deb8u1) suggests that changes to the netgroup caching code  
(there have been quite a few nscd caching changes between those two  
versions) may have introduced this issue between glibc 2.13 and 2.19.


The above issue is definitely not present in glibc from Debian squeeze  
(we have many servers running that version) and is probably not  
present in Debian wheezy either (only one test server deployed), but it  
really bites us (the Debian Edu team) on Debian Edu jessie.


The workaround at the moment is to disable nscd netgroup caching in  
nscd.conf. This is far from optimal.
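
For illustration, the workaround amounts to flipping one directive in
nscd.conf. The following is a minimal sketch, assuming the usual
"enable-cache netgroup yes" line in /etc/nscd.conf; the helper name is
hypothetical, and nscd presumably has to be restarted afterwards for the
change to take effect.

```python
import re


def disable_netgroup_cache(conf_text):
    """Return nscd.conf text with the netgroup cache switched off."""
    # Match an uncommented "enable-cache netgroup <value>" line and keep
    # its original spacing; only the value is replaced.
    pattern = re.compile(
        r"^([ \t]*enable-cache[ \t]+netgroup[ \t]+)\S+", re.MULTILINE
    )
    new_text, count = pattern.subn(r"\g<1>no", conf_text)
    if count == 0:
        # No such directive present: append one explicitly.
        if new_text and not new_text.endswith("\n"):
            new_text += "\n"
        new_text += "enable-cache netgroup no\n"
    return new_text
```

The function only transforms text, so the result can be reviewed before
anything is written back to the configuration file.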


Upstream has recently observed issues with (LDAP and) netgroup caching as well:

  https://sourceware.org/bugzilla/show_bug.cgi?id=16878
  Patch:  
https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commitdiff;h=c3ec475c5dd16499aa040908e11d382c3ded9692;hp=aa2f176d6f75b86b91e544c2e494066ac8f88cbd


  https://sourceware.org/bugzilla/show_bug.cgi?id=16760
   
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=dd3022d75e6fb8957843d6d84257a5d8457822d5
   
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ea7d8b95e2fcb81f68b04ed7787a3dbda023991a


  https://sourceware.org/bugzilla/show_bug.cgi?id=16695
   
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c44496df2f090a56d3bf75df930592dac6bba46f


  https://sourceware.org/bugzilla/show_bug.cgi?id=16758
   
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=fbd6b5a4052316f7eb03c4617eebfaafc59dcc06


BZ #16878 in particular looks like a good candidate for fixing this. I  
recommend backporting all of the above patches since, judging by those  
bug reports, glibc 2.19 seems quite buggy regarding netgroup caching  
in nscd.


Greets,
Mike


--

DAS-NETZWERKTEAM
mike gabriel, herweg 7, 24357 fleckeby
fon: +49 (1520) 1976 148

GnuPG Key ID 0x25771B31
mail: mike.gabr...@das-netzwerkteam.de, http://das-netzwerkteam.de

freeBusy:
https://mail.das-netzwerkteam.de/freebusy/m.gabriel%40das-netzwerkteam.de.xfb

