We upgraded our systems to RedHat 7.2 with glibc2.2.4
and all of a sudden, our mod_perl scripts that call
readdir() would either fail with an exception, or
readdir() in list context would return the correct
number of items but with each item an empty string.

We scanned the modperl list and found about 50 posts in
about 8 threads on this topic (!) but no answers.

There were various workaround suggestions, such as
using glob(), but these were not useful to us since we
rely on a number of third-party perl modules which call
readdir() and it was not practical to modify them.

We discovered that this problem is caused by a build
mismatch between perl and mod_perl that is a result of
a nasty binary incompatibility between glibc2.2.4 and
glibc2.1.3.  So much for DSO versioning!

To get rid of the problem, you _must_ rebuild perl from
scratch on a system with glibc2.2.4 (in our case we had
perl 5.6.1), then you _must_ rebuild mod_perl from
scratch on the same system (we had mod_perl 1.26) and
then build and run httpd on the same system.

(OK, technically it doesn't have to be the same system,
but you must have the same glibc version installed
during each build and also when you run httpd.)
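
A quick way to confirm that a given machine's glibc headers
and its run-time glibc agree is a small C check like the
sketch below. The function names are ours, made up for
illustration. It relies on the glibc-only call
gnu_get_libc_version() and the __GLIBC__ / __GLIBC_MINOR__
macros, and it only describes the machine it is compiled and
run on. It cannot tell you which glibc an existing libperl.a
was built against.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#if defined(__GLIBC__)
#include <gnu/libc-version.h>   /* glibc-specific header */
#endif

/* Version of the glibc headers this file was compiled against,
 * written as "major.minor" ("unknown" on a non-glibc system). */
void compiled_glibc_version(char *buf, size_t len)
{
    assert(len > 0);
#if defined(__GLIBC__)
    snprintf(buf, len, "%d.%d", __GLIBC__, __GLIBC_MINOR__);
#else
    snprintf(buf, len, "unknown");
#endif
}

/* Version of the glibc shared library loaded at run time. */
const char *runtime_glibc_version(void)
{
#if defined(__GLIBC__)
    return gnu_get_libc_version();
#else
    return "unknown";
#endif
}

/* Returns 1 when the run-time version begins with the compile-time
 * "major.minor", and 0 otherwise. */
int glibc_versions_match(void)
{
    char built[32];
    const char *running = runtime_glibc_version();
    size_t n;

    compiled_glibc_version(built, sizeof built);
    n = strlen(built);
    if (strncmp(built, running, n) != 0)
        return 0;
    /* Do not let "2.2" match "2.21": the minor number must end here. */
    return running[n] == '\0' || running[n] == '.';
}
```

Calling glibc_versions_match() from a main() on the build box
and again on the box that runs httpd, and printing both version
strings, should show the same numbers in both places.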

Or if you're not the building type, you'll need to
locate prebuilt binaries that were built entirely
on a system with glibc2.2.4.

The analysis of the problem goes something like this:

- the code which is hurt by the glibc incompatibility
is the code that actually calls the C readdir()
function.  It gets built into
/opt/perl/lib/5.6.1/i686-linux/CORE/libperl.a by the
perl build.

- that code is then statically linked into mod_perl by
the mod_perl build (regardless of whether mod_perl
itself is statically or dynamically linked to httpd).

- when you build libperl.a on RedHat 6 with glibc2.1.3,
struct dirent is 272 bytes long and dirent.d_name is at
offset 15.

- when you build libperl.a on RedHat 7 with glibc2.2.4,
struct dirent is 276 bytes long and dirent.d_name is at
offset 19.

- a compile time / runtime discrepancy here could
definitely cause the symptom we're seeing (right number
of files, but empty filenames) because the struct is
like this:

struct dirent {
    long        d_ino;
    __kernel_off_t  d_off;
    unsigned short  d_reclen;
    char        d_name[256];
};

If glibc2.2 placed the name at the correct place, but
libperl.a was reading it out 4 bytes early, then
libperl.a would probably see a byte of 0 as the first
byte of the name, so the name would look like an empty
C string.
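
To make that concrete, here is a minimal C sketch. The first
two functions report the real struct dirent layout for
whatever headers you compile against. The rest simulates the
misread using the two d_name offsets we measured (15 and 19).
All names are hypothetical, and the fake record is zero-filled
before the name is written, which is our assumption about what
sits just in front of the name, not something we verified.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/* Layout of struct dirent as seen by the headers used to compile this. */
size_t dirent_size(void)        { return sizeof(struct dirent); }
size_t dirent_name_offset(void) { return offsetof(struct dirent, d_name); }

/* Offsets and size reported in this post. */
enum { OLD_NAME_OFFSET = 15, NEW_NAME_OFFSET = 19, RECORD_SIZE = 276 };

/* Build a fake directory record: every byte is zero except the name,
 * which the "library" side stores at name_offset.
 * rec must point to at least RECORD_SIZE bytes. */
void fill_record(unsigned char *rec, size_t name_offset, const char *name)
{
    assert(name_offset + strlen(name) < RECORD_SIZE);
    memset(rec, 0, RECORD_SIZE);
    memcpy(rec + name_offset, name, strlen(name) + 1);
}

/* Read the name the way a caller compiled with name_offset would. */
const char *read_name(const unsigned char *rec, size_t name_offset)
{
    return (const char *)rec + name_offset;
}
```

Filling a record with NEW_NAME_OFFSET and reading it back with
OLD_NAME_OFFSET gives a string of length 0, while reading it
with NEW_NAME_OFFSET gives the name back intact. Printing
dirent_size() and dirent_name_offset() on each build machine
is a cheap way to compare real layouts.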

We say this because the byte it would read instead
probably belongs to d_off, and that byte is probably 0.
Either d_off is 0 all the time (since the man page says
"Use of other fields [including d_off] will harm the
portability of your programs"), or the byte is the third
byte of d_off, and you'd have to have a directory that
was 256*256 bytes or more in order to get a nonzero
value in that byte on a little-endian machine.  So really
this bug discriminates against people with a lot of
files or big-endian machines :) We could test this
theory by creating large directories, but we're happy
that rebuilding works :)
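
The 256*256 figure is just byte arithmetic, which this small
C sketch spells out. It assumes a 32-bit d_off stored
little-endian, and le_byte() is a helper name we made up.

```c
#include <assert.h>
#include <stdint.h>

/* Byte number `index` (0 = lowest address) of a 32-bit value
 * as it would be laid out in memory on a little-endian machine. */
unsigned le_byte(uint32_t value, unsigned index)
{
    assert(index < 4);
    return (value >> (8 * index)) & 0xFFu;
}
```

le_byte(65535, 2) is 0 and le_byte(65536, 2) is 1, so the
third byte only becomes nonzero once the value reaches
256*256.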

