This is a patch against src/backend/storage/file/fd.c taken from 9.2beta1. This patch is submitted for review and comments, not for application to the code base. *WIP*
This patch addresses a performance problem stemming from the use of FindFirstFile() and FindNextFile() to iterate over a directory on Windows. These two functions are used in the port of readdir() for Windows. Unfortunately, unlike readdir() on Linux, these Windows directory iteration functions return the equivalent of a stat() call for each file iterated. Hence, if a directory contains many tens of thousands of files, the iteration can take several minutes to complete.

In RemovePgTempFiles(), multiple directories are iterated and all files found which match the pattern for a temporary file are unlinked. The pattern matching is performed *outside* the directory iteration. This patch passes a file pattern like "t*", which matches all temporary files, rather than iterating over all files in the directory, thus pushing the pattern match *inside* the directory iteration and significantly reducing startup time.

This is not theoretical. The real-world database where I found this problem is on a Windows 2003 server running PostgreSQL 9.1.3 and having 56,000 tables. I was able to duplicate the problem on a Windows 2008 server.

To reproduce, you will need a database on Windows with tens of thousands of tables and a recent version of PostgreSQL. Reboot the Windows server so that the directory contents are guaranteed not to be in the filesystem cache. Start postgres using pg_ctl, and note that it takes several minutes to start. After applying the patch and repeating these steps, the server should start noticeably faster.

I have the following reservations about my design, and solicit comments and suggestions for improvement:

1) The changes I made in fd.c pass a pattern rather than a name into ReadDir *knowing the details* of how ReadDir on Windows will use the port of readdir in src/port/dirent.c, and that in that code FindFirstFile() and FindNextFile() will be called. This knowledge about the inner workings of the port of readdir() is not appropriate inside fd.c, IMHO.

2) I used a fair amount of #ifdef WIN32 to avoid adding unnecessary variables or branches to the non-Windows code. Since this code is probably not on the critical path performance-wise, this may be overkill.

3) The pattern passed to ReadDir of the form "t*" should probably be something closer to (in pcre form) m/^t\d+_\d+/, rather than m/^t.*/. I am not sufficiently familiar with how Windows interprets file patterns, and whether it interprets them differently from one version of Windows to another, to be comfortable making a more precise pattern.

4) Other places in the PostgreSQL sources where directory iteration is needed should probably use a pattern if possible when running on Windows. Thus, it might make more sense to have a version of ReadDir that explicitly takes a pattern, and use that version of ReadDir elsewhere in the codebase.
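Regarding reservation 3: the wildcards accepted by FindFirstFile() are limited to "*" and "?", so a digit-only pattern cannot be expressed there. One possible compromise is to keep the loose "t*" wildcard for the directory scan and apply a strict check in C to each name it returns before unlinking. The following is only a sketch under that assumption. The function name looks_like_temp_relation_name() is hypothetical and is not part of fd.c or of the attached patch, and the name format is taken solely from the m/^t\d+_\d+/ form given above.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not in fd.c or the patch): return true when "name"
 * begins with t<digits>_<digits>, i.e. the m/^t\d+_\d+/ form from item 3.
 * Like that regex, it is anchored only at the start, so any trailing
 * characters after the second digit run are accepted.
 */
bool
looks_like_temp_relation_name(const char *name)
{
    const unsigned char *p = (const unsigned char *) name;

    if (p == NULL || *p != 't')
        return false;
    p++;

    /* first digit run: at least one digit is required */
    if (!isdigit(*p))
        return false;
    while (isdigit(*p))
        p++;

    /* separator between the two digit runs */
    if (*p != '_')
        return false;
    p++;

    /* second digit run: one digit is enough to satisfy \d+ */
    return isdigit(*p) != 0;
}
```

With a check like this, the wildcard only has to narrow the scan, and the decision to unlink a file would not depend on how a particular Windows version interprets the pattern.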
Attachment: fd.diffs