We upgraded our systems to RedHat 7.2 with glibc2.2.4 and all of a sudden, our mod_perl scripts which call readdir() would either fail with an exception, or readdir() in list context would return the correct number of items, but with each item as an empty string.
We scanned the modperl list and found about 50 posts in about 8 threads on this topic (!) but no answers. There were various workaround suggestions, such as using glob(), but these were not useful to us since we rely on a number of third-party perl modules which call readdir() and it was not practical to modify them.

We discovered that this problem is caused by a build mismatch between perl and mod_perl that is a result of a nasty binary incompatibility between glibc2.2.4 and glibc2.1.3. So much for DSO versioning!

To get rid of the problem, you _must_ rebuild perl from scratch on a system with glibc2.2.4 (in our case we had perl 5.6.1), then you _must_ rebuild mod_perl from scratch on the same system (we had mod_perl 1.26) and then build and run httpd on the same system. (ok, technically it doesn't have to be the same system, but you must have the same glibc version installed during each build and also when you run httpd).

Or if you're not the building type, you'll need to locate some pre-done build that was done entirely on a system with glibc2.2.4.

The analysis of the problem goes something like this:

- the code which is hurt by the glibc incompatibility is the code that actually calls the C readdir() function. it gets built into /opt/perl/lib/5.6.1/i686-linux/CORE/libperl.a by the perl build.
- that code is then statically linked into mod_perl by the mod_perl build (regardless of whether mod_perl itself is statically or dynamically linked to httpd).
- when you build libperl.a on RedHat 6 with glibc2.1.3, struct dirent is 272 bytes long and dirent.d_name is at offset 15.
- when you build libperl.a on RedHat 7 with glibc2.2.4, struct dirent is 276 bytes long and dirent.d_name is at offset 19.
- a compile time / runtime discrepancy here could definitely cause the symptom we're seeing (right number of files, but empty filenames) because the struct is like this:

    struct dirent {
        long            d_ino;
        __kernel_off_t  d_off;
        unsigned short  d_reclen;
        char            d_name[256];
    };

If glibc2.2 placed the name at the correct place, but libperl.a was reading it out 4 bytes early, then libperl.a would probably see a byte of 0 as the first byte of the name, hence looking like an empty C string. We say this because the byte it would read is probably 0: either d_off is 0 all the time (since the man page says "Use of other fields [including d_off] will harm the portability of your programs,"), or the byte is the third byte of d_off, and you'd have to have a directory that was 256*256 bytes or more in order to get a nonzero value there on a little endian machine.

So really this bug discriminates against people with a lot of files or big-endian machines :) We could test this theory by creating large directories but we're happy that rebuilding works :)
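
For anyone who wants to poke at the layout theory, here is a rough C sketch. dirent_size() and dirent_name_offset() just report what the local headers give for struct dirent, so compiling them on each build host should show whether the two numbers differ. fill_record() and name_at() are hypothetical helpers that simulate the mismatch on a plain byte buffer. They use the offsets 19 and 15 quoted above and assume, following the struct shown, a 32-bit d_off and a 16-bit d_reclen stored little-endian directly before d_name. None of this is taken from perl or glibc source.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Figures quoted in the post; the rest of the record layout is assumed. */
enum {
    RECORD_SIZE     = 276, /* sizeof(struct dirent) reported for glibc2.2.4 */
    NEW_NAME_OFFSET = 19,  /* where the C library is assumed to put d_name  */
    OLD_NAME_OFFSET = 15   /* where a stale libperl.a would look for d_name */
};

/* What the headers on *this* machine say about struct dirent. */
size_t dirent_size(void)        { return sizeof(struct dirent); }
size_t dirent_name_offset(void) { return offsetof(struct dirent, d_name); }

/*
 * Fill a simulated record the way the "new" library is assumed to:
 * d_off (4 bytes) and d_reclen (2 bytes), both little-endian, sit
 * directly before d_name, followed by the NUL-terminated name.
 * rec must point at RECORD_SIZE bytes.
 */
void fill_record(unsigned char *rec, uint32_t d_off, const char *name)
{
    size_t i;
    size_t len = strlen(name);

    assert(len < (size_t)(RECORD_SIZE - NEW_NAME_OFFSET));

    memset(rec, 0, RECORD_SIZE);
    for (i = 0; i < 4; i++)  /* d_off, low byte first */
        rec[NEW_NAME_OFFSET - 6 + i] = (unsigned char)(d_off >> (8 * i));
    rec[NEW_NAME_OFFSET - 2] = (unsigned char)(RECORD_SIZE & 0xff);
    rec[NEW_NAME_OFFSET - 1] = (unsigned char)(RECORD_SIZE >> 8);
    memcpy(rec + NEW_NAME_OFFSET, name, len);  /* buffer is already zeroed */
}

/* Read the name the way code compiled for name_offset would. */
const char *name_at(const unsigned char *rec, size_t name_offset)
{
    return (const char *)(rec + name_offset);
}
```

Under these assumptions, filling a record with a small d_off such as 4096 and reading it with name_at(rec, OLD_NAME_OFFSET) gives an empty string, while name_at(rec, NEW_NAME_OFFSET) gives the real filename. With d_off set to 0x00410000 the stale reader sees "A" instead, which is the large-directory case mentioned above.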