Hey Dude, Thanks for doing this work. I commented on your notes first, and the code-review feedback is follows that.
> Notes: > * The tar streaming now has to write the file to a temporary file > first before streaming. This is an unfortunate side-effect of cherrypy > not providing a file object to write data directly back to the client. > Initially, I considered using a StringIO object, but that would have > bloated the server process quite a bit (memory usage). This effectively negates the intended benefit of streaming the files. The idea was to write the tar stream out to the socket as we read each file so that memory and disk usage would remain low. The approach that you're taking now doesn't need to employ streaming. You could just write the tarfile to a temporary file, serve it, and delete it. I'm not sure that's the right approach, though. Are you sure there's no sneaky way to get CherryPy to write a streaming response with an unknown length? In a previous e-mail you noted that it's possible to run CherryPy behind Apache. The documentation for downloading files seems to hint that using Apache is preferable, at least for this task. (http://www.cherrypy.org/wiki/FileDownload). There has been a fair amount of hand-wringing about how filelist doesn't fit with our architectural principles. Perhaps now would be an opportune time to investigate whether we could switch to sitting behind Apache and simply pipeline our requests for multiple files? To be clear, I'm not implying that you should do that work all by yourself. > * I have not yet done extensive performance testing, nor have I > checked for the threading issues that we previously had. Will you do both before you putback? An extensive test may be above and beyond the call of duty; however, some amount of performance and threading testing would sure be nice. I have one more question that wasn't really in your notes. One of the justifications for switching to CherryPy was that it would simplify our codebase. However, most of the code in the depot seems to have been moved (more, or less) from depot.py to repository.py. Is there some portion of the existing code that you think we could simplify once we switch to using CherryPy? I'm also curious how you see this change improving our ability to add new features to the depot. Perhaps you could give an example of how you see this helping future development? > webrev: > http://cr.opensolaris.org/~swalker/pkg-depot-3/ depot.py: 44 - Would it make sense to check if the import failed and print a message that CherryPy is required to run the depot? catalog.py: I'm a little slow this morning. Why do we need to embed two different methods in catalog.send()? I'm sure it's justified. I'm just having a hard time figuring out why. updatelog.py: 26 - This module is common to the client and the server. Unless I'm being a moron, this import means the client will also depend upon cherrypy being installed. repository.py: Just so I'm sure I understand, what does the @cherrypy.expose decorator do? 79 - If cherrypy handles this automatically, can we assert this condition, or throw some kind of exception if we're processing a function that isn't exposed? 307 - Does serve_file return a 404 if it can't find the file at the path that it has been given? There are a number of places in updatelog and transaction where you refer to the current request or response as cherrypy.request and cherrypy.response, respectively. Are these guaranteed to always be for the HTTP session we're currently processing? Do you know how cherrypy enforces this? Thanks, -j _______________________________________________ pkg-discuss mailing list [email protected] http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
