> >                 if (r->finfo.filetype == APR_REG
> >                     && (!r->path_info || !*r->path_info) )

> Possible stupid question, but why do we need this path_info check?

ap_directory_walk uses r->path_info to hold the portion of the path that has
not yet been walked.  The check on r->path_info tells us if we are at "the
end of the line", i.e., if there are no more path segments to append and
check.  Note that the loop is a little warped: it checks htaccess, appends
the next segment, lstat's the new appended path, then loops back to check
htaccess on the appended path, etc.  So at the time we are doing the
optimization, we are working with the path that has just had the next
segment appended and has not yet been checked for anything.

This inner if statement says "if the full path points to a regular file and
if we have no more path segments to check after this one, then break the
loop."  It is in part the same test that is done at line 887--the comment
there reads "Time for all good things to come to an end?".

Having explained the code, I realize that the test is not exactly what it
should be.  It does not handle the case of being at "the end of the line"
with an unknown file type.

To handle this case, the code should read as follows:

            if (r->finfo.filetype
#ifdef CASE_BLIND_FILESYSTEM
                && (filename_len <= canonical_len)
#endif
                && (opts.opts & OPT_SYM_LINKS) )
            {
                ++seg;

                if (r->path_info && *r->path_info)
                {
                    thisinfo.filetype = APR_DIR;
                    continue;
                }
                else if (r->finfo.filetype == APR_REG)
                {
                      thisinfo.filetype = APR_REG;
                    break;
                }
            }

The inner if's now say:

1. If we have more path segments to check after this one, treat this segment
as a directory and continue with the path walk.

2. Else, we have no more path segment to check after this one, therefore:

(a) If the full path we are checking is a regular file, we are done with all
checks and can break from the loop

(b) else, treat the current path as unknown and continue on with the lstat.

It should be noted that this optimization is not completely optimal with a
filesystem that is not case sensitive like Windows.  With
CASE_BLIND_FILESYSTEM, the computation of canonical_len compares
r->canonical_filename to r->path_info, then deducts the filename
("file.html"). As a result, canonical_len will always be less than the full
path length and the test (filename_len <= canonical_len) will be false once
the file name itself ("file.html") is appended to r->filename.  The net
result is that the final lstat will always be performed a second time on the
full path length.  This could be fixed, but since I'm not sure exactly how
all the canonical_filename stuff is used, I'm not going to touch it.

The Windows code has more problems than that though, since it actually walks
and stats each component in the path twice (see file I/O log included
below.)  It does it a first time in
ap_process_request_internal->ap_run_translate_name->ap_core_translate->apr_f
ilepath_merge (win32 version) which includes an apr_stat on each subdir in
path until it finds a file.  Then it stats each component in the path a
second time in ap_directory_walk.  The unix version of apr_filepath_merge
does not contain an apr_stat so this is not a problem.

> Could you please provide it as a unified diff against trunk?

Attached and included inline, a diff -u against v2.2.4.  It should apply
fine to the trunk since the relevant code has not changed.

--- .svn/text-base/request.c.svn-base   2007-06-04 23:40:23.801630400 -0400
+++ request.c   2007-06-13 18:25:27.104542400 -0400
@@ -931,13 +931,21 @@
 #ifdef CASE_BLIND_FILESYSTEM
                 && (filename_len <= canonical_len)
 #endif
-                && ((opts.opts & (OPT_SYM_OWNER | OPT_SYM_LINKS)) ==
OPT_SYM_LINKS))
+                && (opts.opts & OPT_SYM_LINKS) )
             {
+                ++seg;

+                if (r->path_info && *r->path_info)
+                {
                 thisinfo.filetype = APR_DIR;
-                ++seg;
                 continue;
             }
+                else if (r->finfo.filetype == APR_REG)
+                {
+                    thisinfo.filetype = APR_REG;
+                    break;
+                }
+            }

             /* We choose apr_stat with flag APR_FINFO_LINK here, rather
that
              * plain apr_stat, so that we capture this path object rather
than


------------------

Effect of patch

Configuration:

docroot: /usr/local/apache/htdocs
AllowOverrides and FollowSymLinks are enabled

request: http://localhost/dir/subdir/file.html
This file exists, and there is an .htaccess in the docroot


Linux file I/O:

[pid 10755] stat64("/usr/local/apache2/htdocs/dir/subdir/file.html",
{st_mode=S_IFREG|0644, st_size=48, ...}) = 0
[pid 10755] open("/usr/local/apache2/htdocs/.htaccess",
O_RDONLY|O_LARGEFILE) = 12
[pid 10755] open("/usr/local/apache2/htdocs/dir/.htaccess",
O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
[pid 10755] open("/usr/local/apache2/htdocs/dir/subdir/.htaccess",
O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
[pid 10755] open("/usr/local/apache2/htdocs/dir/subdir/file.html",
O_RDONLY|O_LARGEFILE) = 12


Windows file I/O, with annotations:

The following file I/O is from
ap_process_request_internal
 ap_run_translate_name
  ap_core_translate
   apr_filepath_merge (win32 version)

OPEN    C:\usr\local\apache2\htdocs\    SUCCESS Options: Open Directory
Access: 00100001
DIRECTORY       C:\usr\local\apache2\htdocs\    SUCCESS
FileBothDirectoryInformation: dir
CLOSE   C:\usr\local\apache2\htdocs\    SUCCESS 
OPEN    C:\usr\local\apache2\htdocs\dir\        SUCCESS Options: Open
Directory  Access: 00100001
DIRECTORY       C:\usr\local\apache2\htdocs\dir\        SUCCESS
FileBothDirectoryInformation: subdir
CLOSE   C:\usr\local\apache2\htdocs\dir\        SUCCESS 
OPEN    C:\usr\local\apache2\htdocs\dir\subdir\ SUCCESS Options: Open
Directory  Access: 00100001
DIRECTORY       C:\usr\local\apache2\htdocs\dir\subdir\ SUCCESS
FileBothDirectoryInformation: file.html
CLOSE   C:\usr\local\apache2\htdocs\dir\subdir\ SUCCESS 

The following file I/O is from
ap_run_map_to_storage
 core_map_to_storage
  ap_directory_walk

OPEN    C:\usr\local\apache2\htdocs\dir\subdir\file.html        SUCCESS
Options: Open  Access: 00100080
QUERY INFORMATION       C:\usr\local\apache2\htdocs\dir\subdir\file.html
BUFFER OVERFLOW FileFsVolumeInformation
QUERY INFORMATION       C:\usr\local\apache2\htdocs\dir\subdir\file.html
BUFFER OVERFLOW FileAllInformation
CLOSE   C:\usr\local\apache2\htdocs\dir\subdir\file.html        SUCCESS 
OPEN    C:\usr\local\apache2\htdocs\.htaccess   SUCCESS Options: Open
Access: Read
QUERY INFORMATION       C:\usr\local\apache2\htdocs\.htaccess   BUFFER
OVERFLOW        FileFsVolumeInformation
QUERY INFORMATION       C:\usr\local\apache2\htdocs\.htaccess   BUFFER
OVERFLOW        FileAllInformation
READ    C:\usr\local\apache2\htdocs\.htaccess   SUCCESS Offset: 0 Length:
4096
READ    C:\usr\local\apache2\htdocs\.htaccess   END OF FILE     Offset: 63
Length: 4096
READ    C:\usr\local\apache2\htdocs\.htaccess   END OF FILE     Offset: 63
Length: 4096
CLOSE   C:\usr\local\apache2\htdocs\.htaccess   SUCCESS 
OPEN    C:\usr\local\apache2\htdocs\dir\.htaccess       NOT FOUND
Options: Open  Access: Read
OPEN    C:\usr\local\apache2\htdocs\dir\subdir\.htaccess        NOT FOUND
Options: Open  Access: Read
OPEN    C:\usr\local\apache2\htdocs\dir\subdir\ SUCCESS Options: Open
Directory  Access: 00100001
DIRECTORY       C:\usr\local\apache2\htdocs\dir\subdir\ SUCCESS
FileBothDirectoryInformation: file.html
CLOSE   C:\usr\local\apache2\htdocs\dir\subdir\ SUCCESS 

The following file I/O is from
core default_handler and subsequent clean up

OPEN    C:\usr\local\apache2\htdocs\dir\subdir\file.html        SUCCESS
Options: Open  Access: Read
READ    C:\usr\local\apache2\htdocs\dir\subdir\file.html        SUCCESS
Offset: 0 Length: 60
LOCK    C:\Apache22\logs\access.log     SUCCESS Excl: Yes Offset: 0 Length:
-1
QUERY INFORMATION       C:\Apache22\logs\access.log     SUCCESS Length: 8386
WRITE   C:\Apache22\logs\access.log     SUCCESS Offset: 8386 Length: 88
UNLOCK  C:\Apache22\logs\access.log     SUCCESS Offset: 0 Length: -1
CLOSE   C:\usr\local\apache2\htdocs\dir\subdir\file.html        SUCCESS 

Attachment: request.c.diff
Description: Binary data

Reply via email to