Now that the mailing list is a bit quiet, I would like to see if I can
get some explicit feedback on some issues related to the inability to
update the req.finfo attribute.

Grisha, it would be nice if you could respond on this issue and give
some guidance, else I fear I'll never be able to progress a solution to
it. :-(

As explained in:

  http://issues.apache.org/jira/browse/MODPYTHON-128

although it is possible to assign a new value to req.filename, there is
no way to update req.finfo to the file stats associated with that new
value of req.filename. If one had access to the low level C API this
would normally be achieved using:

  apr_stat(&r->finfo, r->filename, APR_FINFO_MIN, r->pool);

In mod_python though, there is no way to access the function and affect
that outcome.

In mod_perl 1.0 they implemented the behaviour whereby the "finfo"
attribute was automatically updated when the "filename" attribute was
updated by a handler.

In mod_perl 2.0 they dropped this though, as they wanted to preserve
the idea that in mod_perl everything behaved exactly like the C API
they were trying to provide a 1 to 1 mapping for. Thus in mod_perl
2.0 you need to write:

  use Apache2::RequestRec ();
  use APR::Finfo ();
  use APR::Const -compile => qw(FINFO_NORM);
  $r->filename($newfile);
  $r->finfo(APR::Finfo::stat($newfile, APR::Const::FINFO_NORM, $r->pool));

As mod_python isn't attempting to provide a strict 1 to 1 mapping, it
might be argued that it could do what mod_perl 1.0 did and automatically
update the "finfo" attribute when "filename" is updated.
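
As a rough illustration only, the automatic-update behaviour can be
modelled in plain Python. Nothing here is mod_python code: the
ModelRequest class, is_dir(), is_file() and demo() are hypothetical
names invented for the sketch, and os.stat() merely stands in for
apr_stat().

```python
import os
import stat
import tempfile


class ModelRequest:
    """Hypothetical stand-in for a request object; not the mod_python API."""

    def __init__(self, filename):
        self._filename = None
        self.finfo = None
        self.filename = filename  # goes through the setter below

    @property
    def filename(self):
        return self._filename

    @filename.setter
    def filename(self, value):
        # Assigning filename refreshes finfo as a side effect.
        self._filename = value
        try:
            self.finfo = os.stat(value)  # stand-in for apr_stat()
        except OSError:
            # Target is missing or unreadable: no stat data available.
            self.finfo = None


def is_dir(req):
    return req.finfo is not None and stat.S_ISDIR(req.finfo.st_mode)


def is_file(req):
    return req.finfo is not None and stat.S_ISREG(req.finfo.st_mode)


def demo():
    """Return (directory seen first, regular file seen after reassignment)."""
    with tempfile.TemporaryDirectory() as docroot:
        index = os.path.join(docroot, "index.html")
        with open(index, "w") as f:
            f.write("<html></html>")
        req = ModelRequest(docroot)
        before = is_dir(req)
        req.filename = index  # finfo follows without any further call
        after = is_file(req)
        return before, after
```

In this model the caller never touches finfo directly, which is the
convenience being argued for, and also the hidden side effect that
might be considered too magic.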

The only other alternative is to add a new method to the Python request
object for which there isn't, strictly speaking, a direct equivalent in
the Apache C API. That is, a method that calls apr_stat() but which
only performs it in relation to the "filename" and "finfo" attributes
in the request object itself and is not a generic routine.

Since it isn't likely that mod_python will ever provide a lower level
API for finfo related structures and functions, and even if it did they
would most likely be distinct from the request object, the function
added to the request object could still be called "stat()".

Thus the mod_python equivalent to what mod_perl 2.0 does would be:

  req.filename = newfile
  req.stat()

This though doesn't really convey a sense of what is occurring. Thus a
more descriptive name would probably be more appropriate. For example:

  req.filename = newfile
  req.update_finfo()

There is no ap_update_finfo() function in Apache now, but if one were
later implemented, this name would shadow it and prevent it being added
to the request object if it was pertinent for that to be done.
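
A small plain-Python model may help show what the explicit variant
implies for handler authors. It is only a sketch: ExplicitRequest and
stale_then_fresh() are hypothetical names, os.stat() stands in for
apr_stat(), and update_finfo() here is the proposed method name, not
something mod_python currently provides.

```python
import os
import stat
import tempfile


class ExplicitRequest:
    """Hypothetical model: finfo changes only when update_finfo() is called."""

    def __init__(self, filename):
        self.filename = filename
        self.finfo = None
        self.update_finfo()

    def update_finfo(self):
        # os.stat() stands in for apr_stat() applied to self.filename.
        try:
            self.finfo = os.stat(self.filename)
        except OSError:
            self.finfo = None


def stale_then_fresh():
    """Return whether finfo says 'directory' before and after the update."""
    with tempfile.TemporaryDirectory() as docroot:
        index = os.path.join(docroot, "index.html")
        with open(index, "w") as f:
            f.write("<html></html>")
        req = ExplicitRequest(docroot)
        req.filename = index
        stale = stat.S_ISDIR(req.finfo.st_mode)  # still the directory's stat
        req.update_finfo()
        fresh = stat.S_ISDIR(req.finfo.st_mode)  # now the file's stat
        return stale, fresh
```

Between the assignment and the update_finfo() call, finfo still
describes the old path. That window is the cost of the explicit
approach: a handler that forgets the second call leaves the request in
an inconsistent state.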

The next problem is that apr_stat() actually takes an argument indicating
which fields in the "finfo" attribute should be updated. In mod_perl 1.0
the value used when the automatic update was done was APR_FINFO_MIN,
which results in type, mtime, ctime, atime and size being updated. The
documentation for mod_perl 2.0 suggests use of APR_FINFO_NORM instead,
which is described as intended to be used when an atomic unix
apr_stat() is required, whatever that means.

Important to note though is that the ap_directory_walk() function in
Apache, which is used to map a URL against a file in the filesystem,
uses APR_FINFO_MIN.

Now if a function were to be provided, it seems to make sense that it
have a default whereby it uses APR_FINFO_MIN, much as would be the case
if the "finfo" attribute were updated automatically when "filename" is
updated.

Should such a function, if provided, also accept an alternate value
for this argument so that callers can be selective as to which
attributes of "finfo" are updated?

If it were allowed, we have the problem that there are already
attributes in mod_python for:

  FINFO_MODE = 0
  FINFO_INO = 1
  FINFO_DEV = 2
  FINFO_NLINK = 3
  FINFO_UID = 4
  FINFO_GID = 5
  FINFO_SIZE = 6
  FINFO_ATIME = 7
  FINFO_MTIME = 8
  FINFO_CTIME = 9
  FINFO_FNAME = 10
  FINFO_NAME = 11
  FINFO_FILETYPE = 12

Rather than these being equivalents to the APR constants:

  #define       APR_FINFO_LINK   0x00000001
  #define       APR_FINFO_MTIME   0x00000010
  #define       APR_FINFO_CTIME   0x00000020
  #define       APR_FINFO_ATIME   0x00000040
  #define       APR_FINFO_SIZE   0x00000100
  #define       APR_FINFO_CSIZE   0x00000200
  #define       APR_FINFO_DEV   0x00001000
  #define       APR_FINFO_INODE   0x00002000
  #define       APR_FINFO_NLINK   0x00004000
  #define       APR_FINFO_TYPE   0x00008000
  #define       APR_FINFO_USER   0x00010000
  #define       APR_FINFO_GROUP   0x00020000
  #define       APR_FINFO_UPROT   0x00100000
  #define       APR_FINFO_GPROT   0x00200000
  #define       APR_FINFO_WPROT   0x00400000
  #define       APR_FINFO_ICASE   0x01000000
  #define       APR_FINFO_NAME   0x02000000
  #define       APR_FINFO_MIN   0x00008170
  #define       APR_FINFO_IDENT   0x00003000
  #define       APR_FINFO_OWNER   0x00030000
  #define       APR_FINFO_PROT   0x00700000
  #define       APR_FINFO_NORM   0x0073b170
  #define       APR_FINFO_DIRENT   0x02000000

which can be bitwise OR'd together as the argument to the apr_stat()
function, they are used as positional indexes into the tuple which
mod_python provides as req.finfo. Thus there is a clash on the names.
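
The clash can be made concrete with a short standalone sketch. The
numeric values are copied from the two lists above; the
APR_FINFO_FLAGS dictionary and the flags_in() helper are hypothetical
names used only for this illustration and are not part of mod_python
or APR.

```python
# Single-bit APR flags, values copied from the list above.
APR_FINFO_FLAGS = {
    "LINK": 0x00000001,
    "MTIME": 0x00000010,
    "CTIME": 0x00000020,
    "ATIME": 0x00000040,
    "SIZE": 0x00000100,
    "CSIZE": 0x00000200,
    "DEV": 0x00001000,
    "INODE": 0x00002000,
    "NLINK": 0x00004000,
    "TYPE": 0x00008000,
    "USER": 0x00010000,
    "GROUP": 0x00020000,
    "UPROT": 0x00100000,
    "GPROT": 0x00200000,
    "WPROT": 0x00400000,
    "ICASE": 0x01000000,
    "NAME": 0x02000000,
}
APR_FINFO_MIN = 0x00008170
APR_FINFO_NORM = 0x0073b170

# mod_python's existing constants of the same short names are tuple indexes.
FINFO_SIZE = 6
FINFO_NAME = 11


def flags_in(mask):
    """List the single-bit flag names set in mask, lowest bit first."""
    return [name for name, bit in APR_FINFO_FLAGS.items() if mask & bit]


print(flags_in(APR_FINFO_MIN))
print(flags_in(FINFO_SIZE))
print(flags_in(FINFO_NAME))
```

Decoding APR_FINFO_MIN gives MTIME, CTIME, ATIME, SIZE and TYPE,
matching the fields listed earlier. Passing the existing mod_python
FINFO_SIZE (6) as if it were a mask selects no known flag at all, and
FINFO_NAME (11) would be read as LINK, so reusing the existing names
for a flags argument would be silently wrong rather than an obvious
error.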

Thus the whole issue gets very messy. :-(

Overall, my feeling is that following what mod_perl 1.0 did, updating
the "finfo" attribute whenever the "filename" attribute is updated, is
a reasonable solution for mod_python given its much higher level API.
At the moment I haven't been able to foresee any problems that this
might cause. The question though is whether hiding the update like
that is too magic?

Whatever is done, the missing ability to update req.finfo needs to be
added. One use for it that I already have is to get around the
DirectoryIndex problems in mod_python caused by Apache's use of the
ap_internal_fast_redirect() function to implement that feature. The
specifics of this particular issue are documented under:

  http://issues.apache.org/jira/browse/MODPYTHON-146

One solution I already provided for this was:

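  from mod_python import apache

  def fixuphandler(req):
      # Only act on requests that Apache mapped to a directory.
      if req.finfo[apache.FINFO_FILETYPE] == apache.APR_DIR:
          if req.uri[-1] == '/':
              # Redirect internally to the index page, keeping any query string.
              uri = req.uri + 'index.html'
              if req.args: uri += '?' + req.args
              req.internal_redirect(uri)
              return apache.DONE
      return apache.OK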

Although, I believe now it is probably better done in the
HeaderParserHandler phase, that is, before any access, authentication
and authorization checks. Those checks would then only be done once,
in the sub request, instead of multiple times.

If req.finfo were automatically updated when req.filename is updated
then the alternative is to use:

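  import posixpath
  from mod_python import apache

  def headerparserhandler(req):
      # Only act on requests that Apache mapped to a directory.
      if req.finfo[apache.FINFO_FILETYPE] == apache.APR_DIR:
          if req.uri[-1] == '/':
              # Relies on req.finfo being refreshed by this assignment.
              req.filename = posixpath.join(req.filename, 'index.html')
      return apache.OK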

This is possibly better as it avoids the need to perform an internal
redirect and thus is slightly more efficient. It will only work though
if req.finfo is updated, as that will result in the file type changing
from that of a directory to that of a regular file, that value being
later consulted by the type handler of mod_mime when setting up the
content type.

Another area where it will be important to be able to update req.finfo
is when a map to storage hook is implemented as described in:

  http://issues.apache.org/jira/browse/MODPYTHON-123

Without the ability, there would be no point providing the new hook as
it wouldn't be possible to correctly set up the request object for later
phases without it.

Anyway, does anyone have an opinion or anything else to contribute?

To summarise the problem, we need a way of updating req.finfo when
req.filename changes. How should this be done?

I would like to get this sorted sooner rather than later. This also will
not be the last complicated little issue a decision has to be made on,
and I would rather not see a lack of consensus on these issues pushing
delivery of 3.3 further and further into the future. I also would rather
not see us putting them into the too hard basket for a later release.

Feedback much appreciated.

Graham

