Re: Wrong file readdir on NFS client
On Wed, Mar 20, 2013 at 1:22 AM, Antonietta Donzella <antonietta.donze...@ing.unibs.it> wrote:

> Hi,
>
> I share directories on a Scientific Linux cluster using NFS.
>
> SLC6
> kernel 2.6.32-279.22.1.el6.x86_64
> nfs-utils-1.2.3-26.el6.x86_64
>
> After a server-client shutdown, a disagreeable event occurred. Running
> ls in an NFS-shared directory on a client shows duplicated entries for
> some files, while other files and sub-directories are not visible. The
> problem is not present on the NFS server. The directory does not
> contain an enormous number of files.
>
> On the server:
>
>   # ls | wc -l
>   330
>   # du -s
>   8855740
>
> On the client:
>
>   # ls | wc -l
>   ls: reading directory .: Too many levels of symbolic links
>   120
>
> N.B. there are only two symbolic links, and they were not changed
> after the shutdown; however, the problem is not cleared if I remove
> them.
>
>   # du -s
>   (not responding)
>
> Some dmesg entries:
>
>   NFS: directory images/dp contains a readdir loop. Please contact
>   your server vendor. The file: .. has duplicate cookie 683570819
>   NFS: directory images/dp contains a readdir loop. Please contact
>   your server vendor. The file: .. has duplicate cookie 683570819
>   __ratelimit: 2 callbacks suppressed
>   NFS: directory images/green contains a readdir loop. Please contact
>   your server vendor. The file: 57.png has duplicate cookie 1694199390
>   NFS: directory images/green contains a readdir loop. Please contact
>   your server vendor. The file: 57.png has duplicate cookie 1694199390
>
> I have tried booting the system with the older
> kernel-2.6.32-279.el6.x86_64 and with the new kernel
> 2.6.32-358.2.1.el6.x86_64, but the bug is not cleared.
>
> Any ideas? Many thanks in advance,
>
> Antonietta

It looks like you are affected by a known bug. It is probably the same as this one:

http://bugs.centos.org/view.php?id=6241

If so, there is no fix at the moment, unfortunately.

Akemi
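As an aside, a quick way to confirm the duplicated-entry symptom from a listing (a sketch only; the sample names below are taken from the dmesg output in the report, and on a real client you would pipe `ls -a <dir>` into the same `sort | uniq -d` filter instead of `printf`):

```shell
# Simulate the client's broken directory listing, then print only the
# names that occur more than once -- the duplicated entries ls shows.
printf '55.png\n56.png\n57.png\n57.png\n' | sort | uniq -d
# prints: 57.png
```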
Re: Security ERRATA Low: ipa on SL6.x i386/x86_64
Hi,

I can't configure IPA to act as DNS; please see the bottom of this message.

On 03/04/2013 09:09 PM, Pat Riehecky wrote:
> Synopsis: Low: ipa security, bug fix and enhancement update
> Issue Date: 2013-02-21
> CVE Numbers: CVE-2012-4546
>
> It was found that the current default configuration of IPA servers
> did not publish correct CRLs (Certificate Revocation Lists). The
> default configuration specifies that every replica is to generate its
> own CRL; however, this can result in inconsistencies in the CRL
> contents provided to clients from different Identity Management
> replicas. More specifically, if a certificate is revoked on one
> Identity Management replica, it will not show up on another Identity
> Management replica. (CVE-2012-4546)
>
> SL6 x86_64:
> ipa-client-3.0.0-25.el6.x86_64.rpm
> ipa-debuginfo-3.0.0-25.el6.x86_64.rpm
> ipa-python-3.0.0-25.el6.x86_64.rpm
> ipa-admintools-3.0.0-25.el6.x86_64.rpm
> ipa-server-3.0.0-25.el6.x86_64.rpm
> ipa-server-selinux-3.0.0-25.el6.x86_64.rpm
> ipa-server-trust-ad-3.0.0-25.el6.x86_64.rpm
>
> i386:
> ipa-client-3.0.0-25.el6.i686.rpm
> ipa-debuginfo-3.0.0-25.el6.i686.rpm
> ipa-python-3.0.0-25.el6.i686.rpm
> ipa-admintools-3.0.0-25.el6.i686.rpm
> ipa-server-3.0.0-25.el6.i686.rpm
> ipa-server-selinux-3.0.0-25.el6.i686.rpm
> ipa-server-trust-ad-3.0.0-25.el6.i686.rpm
>
> The following packages were added for dependency resolution:
>
> SL6 x86_64:
> certmonger-0.61-3.el6.x86_64.rpm
> mod_nss-1.0.8-18.el6.x86_64.rpm
> nss-3.14.0.0-12.el6.i686.rpm
> nss-3.14.0.0-12.el6.x86_64.rpm
> nss-devel-3.14.0.0-12.el6.i686.rpm
> nss-devel-3.14.0.0-12.el6.x86_64.rpm
> nss-pkcs11-devel-3.14.0.0-12.el6.i686.rpm
> nss-pkcs11-devel-3.14.0.0-12.el6.x86_64.rpm
> nss-sysinit-3.14.0.0-12.el6.x86_64.rpm
> nss-tools-3.14.0.0-12.el6.x86_64.rpm
> nss-util-3.14.0.0-2.el6.i686.rpm
> nss-util-3.14.0.0-2.el6.x86_64.rpm
> nss-util-devel-3.14.0.0-2.el6.i686.rpm
> nss-util-devel-3.14.0.0-2.el6.x86_64.rpm
> policycoreutils-2.0.83-19.24.el6.x86_64.rpm
> policycoreutils-gui-2.0.83-19.24.el6.x86_64.rpm
> policycoreutils-newrole-2.0.83-19.24.el6.x86_64.rpm
> policycoreutils-python-2.0.83-19.24.el6.x86_64.rpm
> policycoreutils-sandbox-2.0.83-19.24.el6.x86_64.rpm
>
> i386:
> certmonger-0.61-3.el6.i686.rpm
> mod_nss-1.0.8-18.el6.i686.rpm
> nss-3.14.0.0-12.el6.i686.rpm
> nss-devel-3.14.0.0-12.el6.i686.rpm
> nss-pkcs11-devel-3.14.0.0-12.el6.i686.rpm
> nss-sysinit-3.14.0.0-12.el6.i686.rpm
> nss-tools-3.14.0.0-12.el6.i686.rpm
> nss-util-3.14.0.0-2.el6.i686.rpm
> nss-util-devel-3.14.0.0-2.el6.i686.rpm
> policycoreutils-2.0.83-19.24.el6.i686.rpm
> policycoreutils-gui-2.0.83-19.24.el6.i686.rpm
> policycoreutils-newrole-2.0.83-19.24.el6.i686.rpm
> policycoreutils-python-2.0.83-19.24.el6.i686.rpm
> policycoreutils-sandbox-2.0.83-19.24.el6.i686.rpm
>
> - Scientific Linux Development Team

I think bind-dyndb-ldap-2.3.2 needs to be added to that dependency list. On attempting to configure ipa-server-3.0.0 for DNS, it complains that bind-dyndb-ldap is not installed. On installing, it says it needs 2.3.2, but only 1.1.0-0.9.b1.el6_3.1 is available. It is, however, available in 6.4, where 3.0.0 will more than likely run happily. Although among the source packages

http://ftp.scientificlinux.org/linux/scientific/6.4/SRPMS/vendor/bind-dyndb-ldap-1.1.0-0.9.b1.el6_3.1.src.rpm

is the latest, the binary

http://ftp.scientificlinux.org/linux/scientific/6.4/i386/os/Packages/bind-dyndb-ldap-2.3-2.el6.i686.rpm

is newer, and I can't find the src to build it myself. There was mention of a similar problem in the transition from 6.1 to 6.2 at

http://listserv.fnal.gov/scripts/wa.exe?A2=ind1201&L=scientific-linux-users&T=0&P=6283

Must I simply wait for 6.4?

Thanks,
Sean
Re: [SCIENTIFIC-LINUX-USERS] Security ERRATA Low: ipa on SL6.x i386/x86_64
On 03/20/2013 03:42 PM, Pat Riehecky wrote:
> On 03/20/2013 08:41 AM, Sean Murray wrote:
>> Hi, I can't configure IPA to act as DNS; please see the bottom of
>> this message.
>
> I'm pushing the updated bind-dyndb-ldap package at this time. It
> should be available in the next 45 minutes.

Awesome, thanks for the amazingly quick turnaround.

Cheers,
Sean
Re: Wrong file readdir on NFS client
On Wed, 20 Mar 2013, Antonietta Donzella wrote:
> [...]

There is more info on this at:

http://thread.gmane.org/gmane.comp.file-systems.ext4/37022

-Connie Sieh
Re: Wrong file readdir on NFS client
On Wed, Mar 20, 2013 at 01:51:23PM -0500, Connie Sieh wrote:
> There is more info on this at
> http://thread.gmane.org/gmane.comp.file-systems.ext4/37022

Thanks. Good reading to refresh the memory. Deja-vu all over again, as they say.

But now I forget when the last bout of trouble with 64-bit readdir cookies was. Was it NFS+XFS? Or NFSv2 vs. NFSv3 vs. glibc, with some programs stealing the extra bits? I remember the solution was to return only 32-bit readdir cookies (31-bit, really, as glibc steals one bit). This time the ext4 people are back with 64-bit cookies, only to step into the same doo-doo...

--
Konstantin Olchanski
Data Acquisition Systems: The Bytes Must Flow!
Email: olchansk-at-triumf-dot-ca
Snail mail: 4004 Wesbrook Mall, TRIUMF, Vancouver, B.C., V6T 2A3, Canada
Re: [SCIENTIFIC-LINUX-USERS] Wrong file readdir on NFS client
On 03/20/2013 07:55 AM, Akemi Yagi wrote:
> On Wed, Mar 20, 2013 at 1:22 AM, Antonietta Donzella <antonietta.donze...@ing.unibs.it> wrote:
>> [...]
>
> Looks like you are affected by a known bug. It is probably the same
> as this one:
>
> http://bugs.centos.org/view.php?id=6241
>
> If so, there is no fix at the moment unfortunately.
>
> Akemi

I've heard a rumor that disabling dir_index on the ext family of filesystems will work around the behavior.

This information is presented without recommendation.

Pat

--
Pat Riehecky
Scientific Linux developer
http://www.scientificlinux.org/
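For reference, that workaround would look something like the sketch below (my own sketch, not from the thread; /dev/sdb1 and /export are hypothetical names, and the filesystem must be unmounted first -- dir_index is the hashed-directory feature whose hash values back the large readdir cookies):

```shell
# Unmount the exported filesystem first (hypothetical device /dev/sdb1).
umount /dev/sdb1

# Turn off the hashed b-tree directory feature on the ext filesystem.
tune2fs -O ^dir_index /dev/sdb1

# Rebuild/optimize the existing directories so the old indexes go away.
e2fsck -fD /dev/sdb1

# Remount at its usual export point (hypothetical mount point /export).
mount /dev/sdb1 /export
```

Note this trades away the large-directory lookup performance that dir_index provides, which is presumably why it is offered without recommendation.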
A silly question
This is perhaps a silly question, but I would appreciate a URL or some other explanation.

A faculty colleague and I were discussing the differences between a supported enterprise Linux and any of a number of "beta" or "enthusiast" Linuxes (including TUV Fedora). A question arose for which I have no answer: why did SL -- which has professional paid personnel at Fermilab and CERN -- choose the present TUV rather than SuSE Enterprise, which is also RPM-based (but yast, not yum) and has to release full source (not binaries/directly usable) for the OS environment under the same conditions as the TUV of SL? SuSE is just as stable, but typically incorporates more current versions of applications and libraries than the TUV chosen does.

Any insight would be appreciated. If SuSE had been chosen (SuSE originally was from the EU and thus a more natural choice for CERN), what would we be losing relative to SL? To the best of my knowledge, there is no SuSE Enterprise clone equivalent to the SL or CentOS clones of TUV EL.

Yasha Karant
Re: A silly question
Well, this is the sort of question I answer at work all the time, so I can tell you, and I know there are sites and even Linux Journal articles that explain it.

Essentially, both labs already had their own in-house compiled versions of RHEL, for slightly different reasons. CERN's was called LTS (Long Term Support) Linux, and its original goal was to keep applying security patches to older RHEL versions after Red Hat declared EOL (End of Life) on them, because there were essentially appliances built for the labs where it was difficult to migrate the apps to newer versions of RHEL, and at the time, if I remember correctly, Red Hat only provided patches for a given version of RHEL for about two years.

The problem is that when you install something in a facility connected to a secure US government facility, either directly or by proximity with fewer than two firewalls in between, it must have all security patches for any installed software within a few months of the creation of the fix for the security hole. Also, every new version of any OS needs to be evaluated for security prior to being connected.

So for CERN, since so many US government agencies already used RHEL, and at the time it was so popular in the US that anyone in the US who knew Linux had used Red Hat at some point, it was really the only choice. As a matter of fact, I can remember Red Hat being so synonymous with Linux in the US in the late 90s that, when I hit a problem compiling a program due to a Red Hat-only bug caused by a patch they put into gcc, I went into four different software stores asking if they had any Linux distro other than Red Hat. The first three stores told me no; the fourth told me "yes, we have plenty" and then walked me over to a wall filled floor to ceiling with various Red Hat (boxed set v5.x, pre-RHEL) box sets with various support add-ons, like the "secure webserver" version that included a script on an additional 3.5" floppy to set up an OpenSSL CA for you -- but they were all Red Hat.

Fermilab's motivation for choosing RHEL over SuSE I'm not sure of, but I suspect that, since they are funded by multiple countries and given the nature of their research, they may have also run into the US government security rules, and in that case it's just easier to go with the flow than to deal with the long drawn-out process of getting a different distro certified.

--
Sent from my HP Pre3

On Mar 21, 2013 12:11 AM, Yasha Karant <ykar...@csusb.edu> wrote:
> [...]