Jeremy Chadwick wrote:
On Mon, Nov 12, 2007 at 09:21:56PM +0100, Erik Stian Tefre wrote:
There seems to be a bug (or feature?) somewhere that limits the number of unique temporary file names used for files uploaded by posting a form. Looking through my web server logs of 110000 file uploads, I find no more than 495 unique temporary file names, which are reused again and again.
(File name example: /var/tmp/phpzzJuIt)

I think PHP is supposed to use mkstemp(). From the mkstemp(3) manual:
"The number of unique file names mktemp() can return depends on the number of `Xs' provided; six `Xs' will result in mktemp() selecting one of 56800235584 (62 ** 6) possible temporary file names."

PHP uses 6 Xs. This makes the low number of observed unique file names (495) a bit disappointing.
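
To put the manual's figure next to the observation, here is a rough sketch in C. It assumes names are drawn uniformly at random from the 62-character alphabet, which is an assumption and not something the manual promises. The helper names temp_name_space, expected_collisions and print_estimate are made up for illustration. The estimate uses the birthday approximation n(n-1)/(2N) for the expected number of colliding pairs.

```c
#include <stdio.h>

/* Hypothetical helper: number of names for a template with n_x Xs,
 * assuming each X is replaced by one of 62 characters [0-9A-Za-z]. */
unsigned long long temp_name_space(unsigned n_x) {
        unsigned long long space = 1;
        unsigned i;

        for (i = 0; i < n_x; i++)
                space *= 62ULL;
        return space;
}

/* Hypothetical helper: birthday approximation for the expected number
 * of colliding pairs among n uniform draws from `space` names. */
double expected_collisions(double n, double space) {
        return n * (n - 1.0) / (2.0 * space);
}

/* Print the name space and the collision estimate for `uploads` draws. */
void print_estimate(unsigned n_x, double uploads) {
        unsigned long long space = temp_name_space(n_x);

        printf("%u Xs: %llu names, ~%.3f expected colliding pairs\n",
               n_x, space, expected_collisions(uploads, (double)space));
}
```

Calling print_estimate(6, 110000.0) reports 56800235584 names and roughly 0.1 expected colliding pairs, so under the uniform assumption nearly all 110000 names should have been unique. That is nowhere near the 495 seen in the logs.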

It sounds as if the limitation in range (56800235584 vs. 495) may be due
to what's considered a permissible character in a filename.  I'm betting
the function ANDs the per-byte results, requiring them to be within
[0-9A-Za-z].  That's (26+26+10)^6.

(26+26+10)^6 = 62^6 = 56800235584. So I guess the limited set of permissible characters is already accounted for in the manual...?

Based on that, it sounds as if there's no "easy" way to increase the
entropy.

I'm not really sure I'd use gettimeofday() for extending this, though.
If I remember correctly (someone please correct me if I'm wrong):

* The clock is not a good source of randomness because it's predictable
  (although in this case it's not the sole source of entropy)

My main concern is random file name collisions, not the predictability of file names. The clock fixes the collision problem. But I guess predictable file names may be a security problem for some applications.

* gettimeofday() is an expensive call due to communication with the RTC.

Probably not too expensive when compared to the time and resources used for handling the uploaded file in the filesystem etc.

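#include <stdio.h>      /* printf() */
#include <stdlib.h>     /* exit() */
#include <sys/time.h>   /* gettimeofday() */

int main (int argc, char ** argv) {
        struct timeval tval;
        int i;

        /* Call gettimeofday() one million times, printing each result. */
        for (i = 0; i < 1000000; i++) {
                gettimeofday(&tval, NULL);
                /* tv_sec and tv_usec are not plain ints; cast to long. */
                printf("%ld %ld\n", (long)tval.tv_sec, (long)tval.tv_usec);
        }
        exit(0);
}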

%time ./gettime > /dev/null
0.492u 5.824s 0:08.06 78.2%     5+190k 0+0io 0pf+0w

That is 1000000 calls in 8.06 seconds of wall-clock time, or about 124k gettimeofday()s per second (including the useless printf()).
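
For illustration, the kind of clock-based extension discussed here could look roughly like the sketch below. This is not PHP's actual code. The function name make_timestamped_temp and the "php<sec>_<usec>_XXXXXX" naming scheme are made up for the example. It still relies on mkstemp() for the random suffix and for creating the file safely; the timestamp only adds a component that changes whenever the clock ticks.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper, not PHP's actual code: builds
 * "<dir>/php<sec>_<usec>_XXXXXX" in buf and opens it with mkstemp().
 * Returns the open file descriptor, or -1 on failure. */
int make_timestamped_temp(const char *dir, char *buf, size_t buflen) {
        struct timeval tval;
        int len;

        if (gettimeofday(&tval, NULL) != 0)
                return -1;
        len = snprintf(buf, buflen, "%s/php%ld_%06ld_XXXXXX", dir,
                       (long)tval.tv_sec, (long)tval.tv_usec);
        if (len < 0 || (size_t)len >= buflen)
                return -1;      /* template did not fit in buf */
        /* mkstemp() replaces the trailing Xs and creates the file. */
        return mkstemp(buf);
}
```

The caller gets the final name back in buf and is responsible for close() and unlink(). As noted in the thread, the timestamp part is predictable, so this only helps against collisions, not against guessing.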

I'm left believing that adding more X's to the path passed to mkstemp()
would be a better solution, and a more compatible one.

If mkstemp() were behaving as expected and according to the docs, I would agree. But it isn't, so I would not be surprised to find no more than 495 longer filenames being reused after adding more Xs. ;-)

I'd like to find the real reason for the limited number of unique filenames. Maybe it's related to how mkstemp() or its random number generator arc4random(3) is used by PHP and/or Apache?
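
One way to narrow this down might be to exercise mkstemp() in a standalone process, outside PHP and Apache, and count how many distinct names come back. The sketch below does that. The helper name count_unique_temp_names and the use of /tmp as the directory are assumptions for the example. Each file is removed right after creation so that a name could, in principle, be handed out again.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define NAME_LEN 32

/* qsort() comparator for fixed-size name buffers. */
static int cmp_names(const void *a, const void *b) {
        return strcmp((const char *)a, (const char *)b);
}

/* Hypothetical test helper: create and immediately remove n temporary
 * files, then return how many distinct names were seen (-1 on error). */
int count_unique_temp_names(int n) {
        char (*names)[NAME_LEN];
        int i, fd, unique;

        if (n <= 0)
                return 0;
        names = malloc((size_t)n * sizeof *names);
        if (names == NULL)
                return -1;
        for (i = 0; i < n; i++) {
                strcpy(names[i], "/tmp/phpXXXXXX");
                fd = mkstemp(names[i]);
                if (fd < 0) {
                        free(names);
                        return -1;
                }
                close(fd);
                unlink(names[i]);   /* free the name so reuse is possible */
        }
        /* Sort the names, then count positions where the name changes. */
        qsort(names, (size_t)n, sizeof *names, cmp_names);
        unique = 1;
        for (i = 1; i < n; i++)
                if (strcmp(names[i - 1], names[i]) != 0)
                        unique++;
        free(names);
        return unique;
}
```

If a standalone run returns close to n distinct names while the web server logs keep showing a few hundred, that would point at how the call is made inside PHP or Apache rather than at mkstemp() itself.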

--
Erik
_______________________________________________
freebsd-ports@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-ports
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
