Re: (should be) simple bind problem [possibly solved]
Glenn English wrote:
> apparmor.

Ah! I would not have thought of that one.

> In the recent Debians (Wheezy++, I think), there is a directory /etc/apparmor.d. In there is a file called usr.sbin.named. That

Yes. But it isn't enabled by default. On a recently installed Debian Jessie 8 system:

    $ dpkg -l | grep apparmor
    $

Usually nothing is installed to start it. Perhaps something you installed pulled it in as a dependency? Looking, I see one of my systems has libapparmor1 but it is still not enabled. So the presence of that one library would not be enough to start it.

> After reboot, and after waiting a few minutes, there are no new permission error entries in the log. I realize this is kind of far fetched, seeing how there was no apparmor startup in init.d, but this has been making me crazy, and I've tried many things that should have fixed it, so I'd do anything.

I really don't know very much about apparmor.

> I found a note in the Debian wiki saying apparmor is installed by default on Wheezy and that it's started by GRUB. That might explain why I didn't find anything in init.d.

I don't see it installed by default on the recently installed Jessie 8 system here. Just a data point. I wouldn't be surprised to find that something else (GNOME for example?) might pull it in as a dependency.

> I don't know when Bind slaves try to update the mod times on their zone files, but I'm pretty sure the master sends out refreshes to the slaves when the master restarts, so I restarted the master. Lots of entries in ns2's log about receiving notifies, but no permission errors.

Everything depends upon the DNS zone serial number. When the master restarts it will send a notify. The slaves will get the notify and check the master's serial number against their cached copy. If the serial number is the same or older then nothing further happens. If the serial number is newer than their cached copy they will request a scheduled zone transfer. It won't happen immediately, to prevent a storm of activity all at once synchronized by the notify. But within a randomized short time a scheduled zone transfer will then occur.

Bob
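The "same or older versus newer" check described above can be sketched in shell. This is only an illustration, assuming the serials are compared with RFC 1982 serial number arithmetic on 32-bit values; the function name `serial_newer` is made up for this example and is not part of BIND.

```shell
#!/bin/sh
# Hypothetical helper, not part of BIND: succeeds (exit 0) when serial $1
# is newer than serial $2 under RFC 1982 arithmetic on 32-bit serials.
serial_newer() {
    # Forward distance from $2 to $1 modulo 2^32 (4294967295 is 2^32 - 1).
    diff=$(( ($1 - $2) & 4294967295 ))
    # Newer means a non-zero forward distance of less than 2^31.
    # A distance of exactly 2^31 is undefined in RFC 1982; this sketch
    # treats it as "not newer".
    [ "$diff" -ne 0 ] && [ "$diff" -lt 2147483648 ]
}

if serial_newer 2015052702 2015052701; then
    echo "master is newer: slave would schedule a transfer"
else
    echo "same or older: nothing further happens"
fi
```

The wrap-around case is why a plain numeric comparison is not quite enough: a small serial can still be "newer" than one close to 2^32.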
Re: (should be) simple bind problem [solved]
On May 26, 2015, at 11:28 PM, Glenn English g...@slsware.net wrote:
> apparmor.

No permission probs in the log this morning. Thanks much to those with suggestions.

--
Glenn English

--
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/93a3e0ca-b185-4c39-a5b3-77213f25b...@slsware.net
Re: (should be) simple bind problem [possibly solved]
On May 27, 2015, at 12:43 PM, Bob Proulx b...@proulx.com wrote:
> Ah! I would not have thought of that one.

I didn't consider apparmor either. Saw a mention of it on an Ubuntu site.

> Yes. But it isn't enabled by default.

I really don't think it is either. But simply renaming that file in the config directory and rebooting fixed the problem. Something's doing something somewhere.

> Usually nothing is installed to start it. Perhaps something you installed pulled it in as a dependency? Looking, I see one of my systems has libapparmor1 but it is still not enabled. So the presence of that one library would not be enough to start it.

I don't know; dependency's always a possibility. But these are servers, so they're pretty lean. I did everything I knew of, and many things I didn't. I swear apparmor isn't on these machines. There's nothing in init.d, nothing in man, and BASH says the apparmor utilities don't exist. But the config info exists. And it's in the kernel.

> I don't see it installed by default on the recently installed Jessie 8 system here. Just a data point. I wouldn't be surprised to find that something else (GNOME for example?) might pull it in as a dependency.

We don't do GNOME. XFCE might have, but I doubt it.

> Everything depends upon the DNS zone serial number. When the master restarts it will send a notify. The slaves will get the notify and check the serial number against their cached copy. If the serial number is the same or older then nothing further happens.

I think BIND might change the mod time of the zone file to reflect that it got a transfer, even if nothing changed.

> But within a randomized short time a scheduled zone transfer will then occur.

Yup. That's why I waited overnight to proclaim the nasty fixed.

I just went into one of the master zone files, added a char to it (and upped the serial), and restarted the master. The transfer of the changed zone file was logged by the slave server, but no error of any kind. The slave zone file has the updated serial and the change I made. All the slave zone files are dated today.

I claim it's fixed. I'll be watching the DNS logs for a few days, though, just to be sure.

--
Glenn English

Archive: https://lists.debian.org/5f1ab695-7c98-47cb-87ee-8f588e440...@slsware.net
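The check described above (bump the serial on the master, then confirm the slave picked it up) can also be done from the command line rather than by reading the zone files. A minimal sketch, assuming `dig` from the dnsutils package is available; the zone name and server addresses in the usage comment are placeholders, not values from this thread.

```shell
#!/bin/sh
# Print the serial from one line of `dig +short <zone> SOA` output, which has
# the form: mname rname serial refresh retry expire minimum
soa_serial() {
    awk 'NF >= 7 { print $3; exit }'
}

# Ask one server for the SOA serial of a zone.
# $1 = zone, $2 = server address (both supplied by the caller).
serial_on() {
    dig +short +time=2 +tries=1 "$1" SOA "@$2" | soa_serial
}

# Usage with placeholder values (replace with a real zone and servers):
#   echo "master: $(serial_on example.com 192.0.2.1)"
#   echo "slave:  $(serial_on example.com 192.0.2.2)"
```

If the two serials match after a change on the master, the transfer went through; if the slave lags for more than a few minutes, its log is the next place to look.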
Re: (should be) simple bind problem [possibly solved]
apparmor.

In the recent Debians (Wheezy++, I think), there is a directory /etc/apparmor.d. In there is a file called usr.sbin.named. That file does various things to the /var/cache/bind directory. I didn't look at it long enough to figure out just what it does, and I couldn't find apparmor on my system. But I figured it must be somewhere if that directory exists in /etc, so I renamed the ...named filename and rebooted (this was all on ns2, the RaspberryPi).

After reboot, and after waiting a few minutes, there are no new permission error entries in the log. I realize this is kind of far fetched, seeing how there was no apparmor startup in init.d, but this has been making me crazy, and I've tried many things that should have fixed it, so I'd do anything.

I found a note in the Debian wiki saying apparmor is installed by default on Wheezy and that it's started by GRUB. That might explain why I didn't find anything in init.d.

I don't know when Bind slaves try to update the mod times on their zone files, but I'm pretty sure the master sends out refreshes to the slaves when the master restarts, so I restarted the master. Lots of entries in ns2's log about receiving notifies, but no permission errors.

--
Glenn English

Archive: https://lists.debian.org/160ce3f4-dc1f-44ad-9b84-0af15a714...@slsware.net
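Whether AppArmor is actually active can be checked without the userspace utilities, because the kernel exposes its state under /sys. The following is a sketch under that assumption; the function name `apparmor_state` is made up here, and its optional argument exists only so it can be pointed at a test file.

```shell
#!/bin/sh
# Report whether the kernel's AppArmor module says it is enabled.
# $1 (optional) overrides the flag file; the default is the kernel's own.
apparmor_state() {
    flag=${1:-/sys/module/apparmor/parameters/enabled}
    if [ ! -r "$flag" ]; then
        echo "absent"
    elif [ "$(cat "$flag")" = "Y" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}

echo "AppArmor: $(apparmor_state)"

# Loaded profiles, if any; reading this file normally requires root.
profiles=/sys/kernel/security/apparmor/profiles
if [ -r "$profiles" ]; then
    grep named "$profiles" 2>/dev/null || echo "no named profile visible (try as root)"
fi
```

An "enabled" result together with a named entry in the profile list would be consistent with what was observed here: no init script and no utilities, yet a profile in /etc/apparmor.d that still has an effect.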
Re: (should be) simple bind problem
On May 25, 2015, at 1:00 AM, Bob Proulx b...@proulx.com wrote:
> Glenn English wrote:
>> root@srv:~# ps -ef | grep named
>> bind 2098 1 0 May10 ? 00:00:36 /usr/sbin/named -u bind
>> root 10498 1 0 May10 ? 00:00:50 /usr/sbin/named -c /etc/bind/named.conf
>
> There are two of them running? That doesn't seem right. The first one looks okay but the second one does not. I would be inclined to kill both of them. Then start it up again and check all over again.

There are 3 servers involved, all running Wheezy, bind 9.8.4: srv, a Dell server box and the 'main' server for the domain -- dns (slave), email, web, ftp, ntp, imap, etc; ns2, the RaspberryPi secondary slave dns server; and log, the Supermicro server on the DMZ -- the only recursive DNS server, accessible only from the LAN and the slave DNS servers (it's called log because that's where things are logged).

srv: Yesterday I killed the 2 bind processes and restarted the one owned by bind. This morning, root's was back. But it seems to not be complaining about much today. I set the directory to mod 777 anyway.

ns2: I found a possibly interesting thing in /var/cache: there's a 'bind' directory (owned by bind:bind) and a 'named' directory (owned by root:root). ns2 says it can't 'set file modification time' in /var/cache/bind of the slave info it gets from the master. A 'ls -lh' shows all the slave files were updated in the last few minutes, and are dated properly. There's no mention of the 'named' directory in the log, and it and all its files were created in 2013 anyway. I set both directories to mod 777 and the ownerships to bind:bind.

log: This Bind has the same problem with setting the file modification time -- but only on its single slave file. I don't know if it modifies the master files. I don't see why Bind would be wanting to change the modification time anyway (NSD doesn't); the kernel does a pretty good job of that whenever a file is modified. There are 2 directories in log:/var/cache/bind: masters and slaves. I set /var/cache/bind and the two directories in it to mod 777.

> Did you by any chance configure bind9 to run in a chroot?

No.

> The next thing I would wonder is if the 'immutable' bit were set on the file system. Again from my system.
>
>     $ lsattr -d /var/cache/bind
>     /var/cache/bind

All three are identical:

    root@ns2:/var/cache# lsattr -d /var/cache/bind
    -e-- /var/cache/bind

> If all else fails I would be inclined to try an experiment. I would open up the permissions on /var/cache/bind to be drwxrwxrwx and then start bind9 and see what files it produces there. The owner and group of the files produced should be a clue. They should be bind but if they were something else that would explain the permission denied message and be a clue as to the problem.
>
>     service bind9 stop
>     chmod a+rwx /var/cache/bind
>     service bind9 start
>     ls -la /var/cache/bind

Done on ns2. Can't set modification time and no tmp files in /var/cache/bind. Modification times look correct, though.

On srv, no complaints in the log and no tmp files in the directory.

I await with bated breath to see the logs, etc. tomorrow morning...

--
Glenn English

Archive: https://lists.debian.org/1f3aff71-02e5-4925-8c8f-06114a8ec...@slsware.net
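The point of the chmod experiment is to see who ends up owning the files named writes. That check can be narrowed to "list anything not owned by the expected user", sketched below. The helper name `not_owned_by` is made up for this example; it relies only on the standard `find -user` test.

```shell
#!/bin/sh
# List entries under directory $1 that are NOT owned by user $2
# (a user name or a numeric uid). An empty result means everything
# there belongs to the expected user.
not_owned_by() {
    find "$1" ! -user "$2" -print
}

# Example for the setup in this thread (run on the name server):
#   not_owned_by /var/cache/bind bind
```

Once the experiment is over, the directory would normally go back to the drwxrwxr-x mode shown in the original listing (chmod 775) rather than staying world-writable.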
Re: (should be) simple bind problem
Glenn English wrote:
> root@srv:~# ps -ef | grep named
> bind 2098 1 0 May10 ? 00:00:36 /usr/sbin/named -u bind
> root 10498 1 0 May10 ? 00:00:50 /usr/sbin/named -c /etc/bind/named.conf

There are two of them running? That doesn't seem right. The first one looks okay but the second one does not. I would be inclined to kill both of them. Then start it up again and check all over again.

    service bind9 stop
    ps -ef | grep named
    kill 10498
    ps -ef | grep named

Make sure none are running. Then start it up again and check.

    service bind9 start

Did you by any chance configure bind9 to run in a chroot?

>> If it isn't that then I would suspect selinux has become enabled but not fully configured.
>
> I'm game. How do I find out/configure it?

If you haven't heard of it then it isn't enabled. I wouldn't suggest enabling it. If you haven't heard of it then I think it is not likely to be the problem.

> root@srv:~# ps aux | egrep -i selinux
> root 13013 0.0 0.0 7828 900 pts/0 S+ 15:48 0:00 egrep -i selinux
>
> If it's running, it doesn't have a pid. I don't really know what SELinux is. I've heard it's a collection of patches to the kernel, but that's all I know.

selinux stands for security-enhanced-linux and is a mandatory access control layer where everything is governed by a security policy. It completely changes the traditional security system. It isn't a daemon.

> I grepped the /etc/default files for selinux. Nothing. I grepped the /etc/init.d startup files. I found 'selinux_enabled' in the checkroot.sh file (if selinux_enabled ...). selinux_enabled is a small function in /lib/lsb/init-functions.sh:
>
>     selinux_enabled () {
>         which selinuxenabled >/dev/null 2>&1 && selinuxenabled
>     }
>
> 'which selinuxenabled' says there's no such file here. So does 'find / -iname *selinuxenabled*'.

That command is in the selinux-utils package. You don't have it. Making it unlikely that you would have selinux blocking you. (No need to install it.)

The next thing I would wonder is if the 'immutable' bit were set on the file system. Again from my system.

    $ lsattr -d /var/cache/bind
    /var/cache/bind

(You can read about it in the 'man chattr' man page.)

> This is happening on Dell, Supermicro, and RaspberryPi boxes, all running Wheezy with default, and updated, kernels, FWIW. The lone Lenny server doesn't seem to have troubles.

It happens on multiple systems? Oh my, that is a problem. I am afraid I am running out of ideas. If it isn't normal user permissions, isn't selinux, isn't ext immutable then I don't know what it would be. It isn't normal. I am running bind9 on similar random architectures and systems and I have not run into any problems caching files there.

If all else fails I would be inclined to try an experiment. I would open up the permissions on /var/cache/bind to be drwxrwxrwx and then start bind9 and see what files it produces there. The owner and group of the files produced should be a clue. They should be bind but if they were something else that would explain the permission denied message and be a clue as to the problem.

    service bind9 stop
    chmod a+rwx /var/cache/bind
    service bind9 start
    ls -la /var/cache/bind

Bob
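Spotting a second named by eye in `ps -ef | grep named` output works, but the check can be made explicit: print the PID of any named process not owned by the expected user. A small sketch; the function name `stray_named` is made up here, and it only filters text, so nothing is killed.

```shell
#!/bin/sh
# Read "user pid command..." lines on stdin and print the PID of every
# named process whose owner is not the expected user given as $1.
stray_named() {
    awk -v want="$1" '($3 == "named" || $3 ~ /\/named$/) && $1 != want { print $2 }'
}

# On a live system the input could come from ps (POSIX output options):
#   ps -e -o user= -o pid= -o args= | stray_named bind
```

Fed the two processes quoted in this thread, it prints only the PID of the root-owned one, which is the process to stop before restarting the service.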
Re: (should be) simple bind problem
On May 25, 2015, at 1:00 AM, Bob Proulx b...@proulx.com wrote:
> Glenn English wrote:
>> root@srv:~# ps -ef | grep named
>> bind 2098 1 0 May10 ? 00:00:36 /usr/sbin/named -u bind
>> root 10498 1 0 May10 ? 00:00:50 /usr/sbin/named -c /etc/bind/named.conf
>
> There are two of them running? That doesn't seem right. The first one looks okay but the second one does not. I would be inclined to kill both of them. Then start it up again and check all over again.

Just tried that. There were still 2 of them there. I stopped Bind9 with the init.d script, but the second (root) was still there. So I killed it and started Bind again (init script doesn't 'killall named' I guess??). Now it looks right.

FWIW, the one started by root was dated yesterday. I started today's Bind a few minutes ago, and root hasn't started another one yet.

I'll look through the logs and look at the rest of your message later.

--
Glenn English

Archive: https://lists.debian.org/9f04045a-3398-42e1-9b61-08e3d04c0...@slsware.net
Re: (should be) simple bind problem
Bob Proulx sent me a number of suggestions, and I tested them. Then I inadvertently replied to him instead of the list. Sorry, Bob, and thanks for the ideas.

On May 21, 2015, at 3:40 PM, Bob Proulx b...@proulx.com wrote:
> The first reason that comes to mind for permission denied is that it doesn't have permission. Because the permission is allowed for user and group bind then it follows that the named must be running as a different user rather than the bind user. Therefore the -u bind option must have been removed.
>
>     $ grep OPTIONS /etc/default/bind9
>     OPTIONS="-u bind"

    root@srv:~/init.d# egrep -i options /etc/default/bind9
    # startup options for the server
    OPTIONS="-u bind"

Nope.

>     $ ps -ef | grep named
>     bind 2257 1 0 May20 ? 00:00:27 /usr/sbin/named -u bind

    root@srv:~# ps -ef | grep named
    bind 2098 1 0 May10 ? 00:00:36 /usr/sbin/named -u bind
    root 10498 1 0 May10 ? 00:00:50 /usr/sbin/named -c /etc/bind/named.conf

>     $ id bind
>     uid=107(bind) gid=115(bind) groups=115(bind)

    root@srv:~# id bind
    uid=104(bind) gid=107(bind) groups=107(bind)

> Has the -u bind option been removed and the daemon is therefore running as a different user id?

After doing your tests, I really don't think so. But I don't know if ps' line

    root 10498 1 0 May10 ? 00:00:50 /usr/sbin/named -c /etc/bind/named.conf

means anything. Looks like it might be about one of today's many restarts.

> If it isn't that then I would suspect selinux has become enabled but not fully configured.

I'm game. How do I find out/configure it?

    root@srv:~# ps aux | egrep -i selinux
    root 13013 0.0 0.0 7828 900 pts/0 S+ 15:48 0:00 egrep -i selinux

If it's running, it doesn't have a pid. I don't really know what SELinux is. I've heard it's a collection of patches to the kernel, but that's all I know.

I grepped the /etc/default files for selinux. Nothing. I grepped the /etc/init.d startup files. I found 'selinux_enabled' in the checkroot.sh file (if selinux_enabled ...). selinux_enabled is a small function in /lib/lsb/init-functions.sh:

    selinux_enabled () {
        which selinuxenabled >/dev/null 2>&1 && selinuxenabled
    }

'which selinuxenabled' says there's no such file here. So does 'find / -iname *selinuxenabled*'.

I grepped the kernel config file in /boot:

    root@srv:/boot# egrep -i selinux config-3.2.0-4-amd64
    CONFIG_SECURITY_SELINUX=y
    # CONFIG_SECURITY_SELINUX_BOOTPARAM is not set
    # CONFIG_SECURITY_SELINUX_DISABLE is not set
    CONFIG_SECURITY_SELINUX_DEVELOP=y
    CONFIG_SECURITY_SELINUX_AVC_STATS=y
    CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE=1
    # CONFIG_SECURITY_SELINUX_POLICYDB_VERSION_MAX is not set
    # CONFIG_DEFAULT_SECURITY_SELINUX is not set

I don't know enough about the kernel to know what those lines mean. Something SELinux seems to be included in the compile, but I haven't been able to find much of a trace of it. (The NSA's good at keeping secrets :-)

You did come up with some new and exciting things that might be bent, but they seem to be OK. And as best I can tell, there's just a hint of SELinux on this machine.

This is happening on Dell, Supermicro, and RaspberryPi boxes, all running Wheezy with default, and updated, kernels, FWIW. The lone Lenny server doesn't seem to have troubles.

--
Glenn English

Archive: https://lists.debian.org/690a8994-2a57-4851-9450-67ee863ea...@slsware.net
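The kernel config lines above say which security modules were compiled in, which is not the same as which one is active. A rough sketch of a filter that keeps only the two kinds of line that matter for that question; the function name `compiled_lsms` and the list of module names in the pattern are choices made for this example, not something from the thread.

```shell
#!/bin/sh
# Read a kernel config on stdin and print only the lines saying that a
# security module is compiled in (CONFIG_SECURITY_<name>=y) or is the
# built-in default (CONFIG_DEFAULT_SECURITY_<name>=y).
compiled_lsms() {
    grep -E '^CONFIG_(DEFAULT_)?SECURITY_(SELINUX|APPARMOR|SMACK|TOMOYO|DAC)=y$'
}

# Example on a Debian system:
#   compiled_lsms < "/boot/config-$(uname -r)"
```

Read this way, `CONFIG_SECURITY_SELINUX=y` together with `# CONFIG_DEFAULT_SECURITY_SELINUX is not set` suggests SELinux support is built into the kernel but is not the module selected by default, which fits the "just a hint of SELinux" impression.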
Re: (should be) simple bind problem
Glenn English wrote:
> I'm getting (and have been for a while) log entries from my slave nameservers like:
>
>     dumping master file: /var/cache/bind/tmp-0EIP3LrP0G: open: permission denied
> ...
>     drwxrwxr-x 2 bind bind 4096 May 21 10:09 /var/cache/bind/

Good.

> Any ideas?

The first reason that comes to mind for permission denied is that it doesn't have permission. Because the permission is allowed for user and group bind then it follows that the named must be running as a different user rather than the bind user. Therefore the -u bind option must have been removed.

    $ grep OPTIONS /etc/default/bind9
    OPTIONS="-u bind"

    $ ps -ef | grep named
    bind 2257 1 0 May20 ? 00:00:27 /usr/sbin/named -u bind

    $ id bind
    uid=107(bind) gid=115(bind) groups=115(bind)

The numbers above are not significant and depend upon the system. Your numbers will be different from this example. It is only important that bind shows up in all three places and not some other name.

Has the -u bind option been removed and the daemon is therefore running as a different user id?

If it isn't that then I would suspect selinux has become enabled but not fully configured.

Bob
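The first of the three checks above, whether the defaults file still passes `-u bind`, can be scripted. A minimal sketch, assuming the Debian layout where the options live in an `OPTIONS=` line in /etc/default/bind9; the function name `options_user` is made up for this example.

```shell
#!/bin/sh
# Print the user named after -u in the OPTIONS line of a defaults file.
# Prints nothing when OPTIONS has no -u, in which case named would keep
# running as whoever started it.
# $1 (optional) is the file to read; the default is Debian's location.
options_user() {
    sed -n 's/^OPTIONS=.*-u[[:space:]]*\([A-Za-z0-9_-]*\).*/\1/p' "${1:-/etc/default/bind9}"
}

# Example:
#   [ "$(options_user)" = "bind" ] && echo "named is started with -u bind"
```

The other two checks (`ps` and `id`) then only need to agree with whatever name this prints.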
(should be) simple bind problem
I'm getting (and have been for a while) log entries from my slave nameservers like:

    dumping master file: /var/cache/bind/tmp-0EIP3LrP0G: open: permission denied

I also see problems with updating modification times of incoming files from masters.

Debian Wheezy, Bind9

There are hundreds of discussions of this problem on the 'Net, and as one of them says, I've tried them all. Most had to do with fixing named.conf* and permissions on directories:

    root@srv:~# egrep directory /etc/bind/named.conf.options
    directory "/var/cache/bind";
    root@srv:~# ls -ld /var
    drwxr-xr-x 12 root root 4096 Jul 15 2014 /var
    root@srv:~# ls -ld /var/cache/
    drwxr-xr-x 16 root root 4096 Oct 11 2014 /var/cache/
    root@srv:~# ls -ld /var/cache/bind/
    drwxrwxr-x 2 bind bind 4096 May 21 10:09 /var/cache/bind/

Permissions and directories look OK to me.

I gave user bind a password and a live shell, logged in, and:

    root@srv:~# su - bind
    bind@srv:~$ pwd
    /var/cache/bind
    bind@srv:~$ touch /var/cache/bind/tmp-0EIP3LrP0G
    bind@srv:~$ ls -lh /var/cache/bind/tmp-0EIP3LrP0G
    -rw-r--r-- 1 bind bind 0 May 21 12:54 /var/cache/bind/tmp-0EIP3LrP0G

It seems to be able to create files.

I added 'bind' to my groups and:

    ghe@srv:~$ touch /var/cache/bind/test
    ghe@srv:~$ ls -lh /var/cache/bind/test
    -rw-r--r-- 1 ghe ghe 0 May 21 13:25 /var/cache/bind/test

One interesting fix I saw involved SELinux; it said that -- I've been at this for a while, so details are fuzzy -- SELinux changes Bind functionality so it can't write some things. But the solution involved sesetbool (approximately; a program to set boolean vars in SELinux) and according to bash and man, the executable doesn't exist on my servers. I can see traces of SELinux here, but nothing I can figure out how to look at.

None of my other server software has this problem, just Bind.

Any ideas?

--
Glenn English

Archive: https://lists.debian.org/8390d5f5-4450-498d-a25a-5da5c16f1...@slsware.net
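The three `ls -ld` commands above walk the path by hand. The same walk can be generated for any directory, as sketched here; the helper name `path_chain` is made up for this example and it only manipulates the path string.

```shell
#!/bin/sh
# Print a path followed by each of its parent directories up to /.
path_chain() {
    p=$1
    while :; do
        printf '%s\n' "$p"
        case $p in
            /|.) break ;;   # stop at the root (or at "." for relative paths)
        esac
        p=$(dirname "$p")
    done
}

# Show owner and mode for every component. named needs search (x)
# permission on each parent and write permission on the directory itself.
#   path_chain /var/cache/bind | while read -r d; do ls -ld "$d"; done
```

For the write test itself, and assuming Debian's `su`, something like `su -s /bin/sh -c 'touch /var/cache/bind/test' bind` should exercise the same permissions without giving the bind account a password or a login shell.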