Hi Dieter, *,

Dieter Maurer wrote:

>   The "FieldStorage" must be rebuilt, because it can contain
>   file objects. These file objects must be reset as they may
>   have been (partially) read in the previous request.
>   This prevent reusing the previous "FieldStorage".

As you may have seen in my first patch proposal I actually tried to
address this problem, although halfbaken and buggy.  So here's the
next trial.  I hope I've got it straight now.

Why didn't I do something to reset the files, FiledStorage creates?

Because I thought it wouldn't be needed.  If you look at
cgi.FieldStorage you will see, that accessing FieldStorage's 'value'
field, will always reset the file before reading it and HTTPRequest's
'BODY' field does the same.

Cause of your hint I looked again and realised that I was wrong.
HTTPRquest's 'BODYFILE' field fails to reset the file before accessing
it (as in OFS/Image.py the File class does).  That could be easily
fixed.  What's worse is that FiledStorage can possibly create
subinstances of itsellf which can possibly contain other fileobjects.
This is the case for multipart-type requests.


How to solve this?

I can see three alternative aproaches:

1) In HTTPRequest.retry loop through self._fs.list (if it exists) and
   call seek(0) on every fileobject, you can find there.  This can
   only work if the maximum depth of the FiledStorage tree is
   predictable, which in turn depends on the possible input.  I don't
   know enough about the possible HTTP requests to determine such a
   limit or to even 'prove' its existance.

2) In HTTPRequest.retry recursively walk through the FieldStorage
   trees in self._fs.list (if it exists).  I wouldn't do that in
   Python without being able to predict the (worst case) number of
   function calls needed to do this recursion.  And without refreshing
   my knowledge of compiler techniques I wouldn't be able to roll the
   recursion out to a loop.

3) This is a somewhat drastic (and possibly conceived as hackish?)
   approach which OTOH is independent of the input and should perform
   quite well.  I overwrite cgi.FieldStorage, esp. its make_file
   method, such that it adds all fileobjects, that it creates, to a
   (possibly caller provided) list.  Thus the caller
   (i.e. HTTPRequest) has a list of all those fileobjects that need to
   be reset if the FieldStorage should be reused (as in ZPublisher's
   retries).

For the patch attached I chose the third approach.  Please let me
know, what you think about it.  If it is assessed to be ok I will
update the collctor item, I opened yesterday.


cheers,

andreas



--- lib.orig/python/ZPublisher/HTTPRequest.py	2004-08-18 16:37:18.000000000 +0200
+++ lib/python/ZPublisher/HTTPRequest.py	2004-08-19 17:42:31.000000000 +0200
@@ -60,6 +60,37 @@
 class NestedLoopExit( Exception ):
     pass
 
+FILELIST_ENV_ENTRY = 'request.filestorage.filelist'
+
+class ZFieldStorage(cgi.FieldStorage):
+    """\
+    Tame the FieldStorage.
+
+    This class overrides the stock implementation to add all file
+    objects, generated while parsing a request, to a list.  This list
+    is given by the caller.
+
+    Unfortunately cgi.FieldStorage is not generic enough, to adapt the
+    list of constructor parameters, by just deriving from it.  Thus
+    the list is specified within the environ dict, given to the
+    constructor.  This is ugly and could be fixed by a simple patch
+    against cgi.FieldStorage.
+    """
+
+    def __init__(self, fp=None, headers=None, outerboundary="",
+                 environ=os.environ, keep_blank_values=0, strict_parsing=0):
+        self._filelist = environ.get(FILELIST_ENV_ENTRY, [])
+        cgi.FieldStorage.__init__(self, fp, headers, outerboundary,
+                                  environ, keep_blank_values,
+                                  strict_parsing)
+
+    def make_file(self, binary=None):
+        import tempfile
+        res = tempfile.TemporaryFile("w+b")
+        self._filelist.append(res)
+        return res
+
+    
 class HTTPRequest(BaseRequest):
     """\
     Model HTTP request data.
@@ -125,10 +156,15 @@
 
     def retry(self):
         self.retry_count=self.retry_count+1
-        self.stdin.seek(0)
+        if not self._fs:
+            raise ValueError, "Retrying on partially initialized request is not supported.  Call processInputs()."
+        for f in self._fs_filelist:
+            f.seek(0)
         r=self.__class__(stdin=self.stdin,
                          environ=self._orig_env,
-                         response=self.response.retry()
+                         response=self.response.retry(),
+                         fs = self._fs,
+                         flist = self._fs_filelist
                          )
         r.retry_count=self.retry_count
         return r
@@ -138,6 +174,8 @@
         # removing tempfiles.
         self.stdin = None
         self._file = None
+        self._fs = None
+        self._fs_filelist = None
         self.form.clear()
         # we want to clear the lazy dict here because BaseRequests don't have
         # one.  Without this, there's the possibility of memory leaking
@@ -237,7 +275,7 @@
         """
         return self._client_addr
 
-    def __init__(self, stdin, environ, response, clean=0):
+    def __init__(self, stdin, environ, response, clean=0, fs=None, flist=None):
         self._orig_env=environ
         # Avoid the overhead of scrubbing the environment in the
         # case of request cloning for traversal purposes. If the
@@ -261,7 +299,11 @@
         self.steps=[]
         self._steps=[]
         self._lazies={}
-
+        self._fs = fs
+        if flist is None:
+            self._fs_filelist = []
+        else:
+            self._fs_filelist = flist
 
         if environ.has_key('REMOTE_ADDR'):
             self._client_addr = environ['REMOTE_ADDR']
@@ -382,23 +424,26 @@
         taintedform=self.taintedform
 
         meth=None
-        fs=FieldStorage(fp=fp,environ=environ,keep_blank_values=1)
-        if not hasattr(fs,'list') or fs.list is None:
+        if not self._fs:
+            fs_env = environ
+            fs_env[FILELIST_ENV_ENTRY] = self._fs_filelist
+            self._fs=ZFieldStorage(fp=fp,environ=fs_env,keep_blank_values=1)
+        if not hasattr(self._fs,'list') or self._fs.list is None:
             # Hm, maybe it's an XML-RPC
-            if (fs.headers.has_key('content-type') and
-                fs.headers['content-type'] == 'text/xml' and
+            if (self._fs.headers.has_key('content-type') and
+                self._fs.headers['content-type'] == 'text/xml' and
                 method == 'POST'):
                 # Ye haaa, XML-RPC!
                 global xmlrpc
                 if xmlrpc is None: import xmlrpc
-                meth, self.args = xmlrpc.parse_input(fs.value)
+                meth, self.args = xmlrpc.parse_input(self._fs.value)
                 response=xmlrpc.response(response)
                 other['RESPONSE']=self.response=response
                 self.maybe_webdav_client = 0
             else:
-                self._file=fs.file
+                self._file=self._fs.file
         else:
-            fslist=fs.list
+            fslist=self._fs.list
             tuple_items={}
             lt=type([])
             CGI_name=isCGI_NAME
_______________________________________________
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://mail.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope )

Reply via email to